Commits

rS338359: Allow multiple FBT probes to share a tracepoint.

Summary

With ifuncs, it's possible for multiple FBT probes to resolve to the
same address. For instance, fbt::copyout:entry and
fbt::copyout_nosmap:entry are the same on one of my systems: the former
is an ifunc and resolves to copyout_nosmap().

fbt_invop() isn't prepared to handle this possibility: it returns after
calling dtrace_probe() for the first matching tracepoint it finds. In
my case it always finds fbt::copyout_nosmap:entry first, so if I enable
fbt::copyout:entry, it never fires.

Add a hackish fix: flag tracepoints as having an associated enabling so
that we know we can skip over them in some cases. This has a few
caveats: we need an IPI to synchronize the flag with threads performing
FBT tracepoint hash table lookups, so disabling probes is more expensive
than before, and if one enables multiple probes sharing a tracepoint,
only one of those probes will ever fire. I will try to implement a
better fix, but this is better than nothing for 12.0.
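
To make the failure mode and the flag-based workaround concrete, here is a minimal user-space sketch. It is not the kernel code: `struct tp`, `lookup_first()` and `lookup_first_enabled()` are hypothetical names invented for illustration, and the hash bucket is reduced to a singly linked list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an FBT tracepoint record. */
struct tp {
	struct tp *hashnext;	/* next entry in the same hash bucket */
	uintptr_t addr;		/* address of the patched instruction */
	const char *name;	/* probe name, for illustration only */
	int enabled;		/* nonzero if the probe has an enabling */
};

/* Behaviour described as the bug: stop at the first address match. */
const struct tp *
lookup_first(const struct tp *bucket, uintptr_t addr)
{
	for (const struct tp *t = bucket; t != NULL; t = t->hashnext) {
		if (t->addr == addr)
			return (t);
	}
	return (NULL);
}

/* Flag-based workaround: ignore matches that nobody has enabled. */
const struct tp *
lookup_first_enabled(const struct tp *bucket, uintptr_t addr)
{
	/* The sketch assumes 0 is never a valid tracepoint address. */
	assert(addr != 0);
	for (const struct tp *t = bucket; t != NULL; t = t->hashnext) {
		if (t->addr == addr && t->enabled)
			return (t);
	}
	return (NULL);
}
```

With two records at the same address and only the second one enabled, `lookup_first()` returns the disabled record, while `lookup_first_enabled()` skips it. If both are enabled, the second lookup still returns only one of them, which is the caveat noted in the summary.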

Diff Detail

Repository

rS FreeBSD src repository - subversion

Event Timeline

markj created this revision. Aug 27 2018, 4:10 PM

Harbormaster completed remote builds in B19215: Diff 47349. Aug 27 2018, 4:10 PM

markj added a reviewer: DTrace. Aug 27 2018, 4:16 PM

I'm a little worried that this may cause additional confusion if someone expects multiple probes to fire at that point. I wonder if a better workaround would be to just add a warning message whenever DTrace is called, or perhaps when someone calls dtrace -l, or a note in the man page or something along those lines, until we fix it? Is there a reason we can't fix this in 13 and MFC the changes?

In D16921#360754, @domagoj.stolfa_gmail.com wrote:

I'm a little worried that this may cause additional confusion if someone expects multiple probes to fire at that point. I wonder if a better workaround would be to just add a warning message whenever DTrace is called, or perhaps when someone calls dtrace -l, or a note in the man page or something along those lines, until we fix it?

Why "additional"? The problem exists regardless since fbt_invop() returns after the first matching tracepoint. The problem was invisible before ifuncs since before that all tracepoints had distinct addresses. This patch just hides the problem in the common case.

Is there a reason we can't fix this in 13 and MFC the changes?

No reason except that I would like "dtrace -n fbt::copyout:entry" to do the right thing in 12.0.

Why "additional"? The problem exists regardless since fbt_invop() returns after the first matching tracepoint. The problem was invisible before ifuncs since before that all tracepoints had distinct addresses. This patch just hides the problem in the common case.

Ah, my bad, I misread your initial description. My understanding was that any tracepoint other than the first one would not even show up in dtrace -l, so it couldn't be enabled in the first place. Now that I understand it properly, I tend to agree with this change.

I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)

In D16921#360785, @markj wrote:

I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)

Perhaps we could even bunch them into a bounded array (or a vector-like implementation) and check it at fbt load-time in order to avoid indirection? A linked list is unlikely to be too painful given all the membars and indirection happening in dynamic variable implementation -- but nonetheless would degrade performance :-).
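
For comparison, the bounded-array idea could look roughly like the sketch below. It is only an illustration of the suggestion: `struct tracepoint`, `tp_attach()`, `tp_collect()` and the bound `TP_MAX_PROBES` are hypothetical, with the bound of 2 taken from the "with ifuncs there will be 2" remark above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TP_MAX_PROBES 2	/* hypothetical bound; the thread cites at most 2 */

/* Hypothetical per-address record holding its probe IDs inline. */
struct tracepoint {
	uintptr_t addr;
	unsigned nprobes;
	uint32_t ids[TP_MAX_PROBES];
};

/* Attach a probe ID; returns 0 on success, -1 if the array is full. */
int
tp_attach(struct tracepoint *tp, uint32_t id)
{
	assert(tp != NULL);
	if (tp->nprobes == TP_MAX_PROBES)
		return (-1);
	tp->ids[tp->nprobes++] = id;
	return (0);
}

/* Copy out every ID that should fire on a hit; returns how many. */
unsigned
tp_collect(const struct tracepoint *tp, uint32_t out[TP_MAX_PROBES])
{
	for (unsigned i = 0; i < tp->nprobes; i++)
		out[i] = tp->ids[i];
	return (tp->nprobes);
}
```

The IDs sit next to the address, so a hit needs no extra pointer chase, at the cost of a hard limit that has to be checked when probes are created.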

Fix the problem a different way: link probes that share a tracepoint
together using a new linkage pointer in struct fbt_probe. When the
tracepoint is hit, make all probes on that tracepoint fire.

This complicates fbt_destroy() a bit.
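
A rough user-space model of the linked approach, to show why removal gets more involved. This is a sketch, not the committed code: `struct probe` and the `probe_*()` helpers are hypothetical, the field names merely echo the ones in the diff, and a counter stands in for the call to dtrace_probe(). Note that the removal loop records the previous bucket entry before advancing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down record; field names echo the diff. */
struct probe {
	struct probe *hashnext;		/* next distinct address in the bucket */
	struct probe *tracenext;	/* next probe sharing this address */
	uintptr_t addr;
	int fired;			/* stands in for calling dtrace_probe() */
};

/* Add p: chain it onto an existing address, or start a new bucket entry. */
void
probe_insert(struct probe **bucket, struct probe *p)
{
	assert(bucket != NULL && p != NULL);
	for (struct probe *h = *bucket; h != NULL; h = h->hashnext) {
		if (h->addr == p->addr) {
			p->hashnext = NULL;
			p->tracenext = h->tracenext;
			h->tracenext = p;
			return;
		}
	}
	p->tracenext = NULL;
	p->hashnext = *bucket;
	*bucket = p;
}

/* On a hit at addr, fire every probe sharing it; returns how many fired. */
int
probe_fire_all(struct probe *bucket, uintptr_t addr)
{
	int n = 0;

	for (struct probe *h = bucket; h != NULL; h = h->hashnext) {
		if (h->addr != addr)
			continue;
		for (struct probe *p = h; p != NULL; p = p->tracenext) {
			p->fired++;
			n++;
		}
		break;
	}
	return (n);
}

/* Unlink p; returns 1 on success, 0 if p is not in the bucket. */
int
probe_remove(struct probe **bucket, struct probe *p)
{
	struct probe *prev = NULL;

	for (struct probe *h = *bucket; h != NULL; prev = h, h = h->hashnext) {
		if (h == p) {
			/* p heads its chain: promote the next sharer, if any. */
			struct probe *repl = p->tracenext;

			if (repl != NULL)
				repl->hashnext = p->hashnext;
			else
				repl = p->hashnext;
			if (prev != NULL)
				prev->hashnext = repl;
			else
				*bucket = repl;
			return (1);
		}
		if (h->addr != p->addr)
			continue;
		/* p may sit further down the chain that h heads. */
		for (struct probe *q = h; q->tracenext != NULL;
		    q = q->tracenext) {
			if (q->tracenext == p) {
				q->tracenext = p->tracenext;
				return (1);
			}
		}
	}
	return (0);
}
```

Removal has two cases: the probe is the entry that the hash bucket points at, in which case the next sharer has to inherit its place in the bucket, or it is somewhere down the per-address chain and only that chain changes.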

Herald added a subscriber: emaste. Aug 27 2018, 6:12 PM

Harbormaster completed remote builds in B19219: Diff 47353. Aug 27 2018, 6:12 PM

In D16921#360786, @domagoj.stolfa_gmail.com wrote:

In D16921#360785, @markj wrote:

I think it would be nicer to instead keep all tracepoints for a given address on a linked list, with a single hash table entry for all of them. fbt_invop() would invoke dtrace_probe() for each probe associated with the tracepoint, which should be fine so long as there aren't many. (With ifuncs there will be 2.)

Perhaps we could even bunch them into a bounded array (or a vector-like implementation) and check it at fbt load-time in order to avoid indirection? A linked list is unlikely to be too painful given all the membars and indirection happening in dynamic variable implementation -- but nonetheless would degrade performance :-).

I think it's acceptable for the time being since, as I pointed out, there are at most 2 probes for a given tracepoint currently. The FBT hash table is itself cache-unfriendly, and I'd rather fix the performance problems there holistically. Here's the distribution of hash chain lengths on a system with a fairly stripped-down kernel:

buckets  chain length
   5395  0
   9755  1
   8748  2
   5244  3
   2369  4
    876  5
    280  6
     73  7
     24  8
      3  9
      1  10
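
For reference, a distribution like this can be gathered by walking every bucket and counting its entries; the bucket counts above sum to 32768. The sketch below is a generic user-space version, not the method used to produce the numbers: `struct node`, `chain_histogram()` and the cap `MAX_LEN` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEN 16	/* hypothetical cap; longer chains land in the last slot */

/* Hypothetical hash chain element; only the linkage matters here. */
struct node {
	struct node *next;
};

/* Fill hist[k] with the number of buckets whose chain holds k entries. */
void
chain_histogram(struct node *const *table, size_t nbuckets,
    size_t hist[MAX_LEN])
{
	assert(table != NULL);
	for (size_t k = 0; k < MAX_LEN; k++)
		hist[k] = 0;
	for (size_t i = 0; i < nbuckets; i++) {
		size_t len = 0;

		for (const struct node *n = table[i]; n != NULL; n = n->next)
			len++;
		if (len >= MAX_LEN)
			len = MAX_LEN - 1;
		hist[len]++;
	}
}
```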

The strategy sounds good to me. Out of curiosity -- which one has 10?!

Remove incorrect claim about cpucontrol(8)-based updates.

Harbormaster completed remote builds in B19226: Diff 47362. Aug 27 2018, 8:07 PM

Restore old diff.

Harbormaster completed remote builds in B19241: Diff 47387. Aug 28 2018, 4:38 PM

This revision was not accepted when it landed; it landed in state Needs Review. Aug 28 2018, 8:21 PM

Closed by commit rS338359: Allow multiple FBT probes to share a tracepoint. (authored by markj).

This revision was automatically updated to reflect the committed changes.

Herald added subscribers: andrew, imp. Aug 28 2018, 8:21 PM

Diff 47399

head/sys/cddl/dev/fbt/aarch64/fbt_isa.c

[146 lines not shown]
* We have a winner!		* We have a winner!
*/		*/
fbt = malloc(sizeof (fbt_probe_t), M_FBT, M_WAITOK | M_ZERO);		fbt = malloc(sizeof (fbt_probe_t), M_FBT, M_WAITOK | M_ZERO);
fbt->fbtp_name = name;		fbt->fbtp_name = name;
if (retfbt == NULL) {		if (retfbt == NULL) {
fbt->fbtp_id = dtrace_probe_create(fbt_id, modname,		fbt->fbtp_id = dtrace_probe_create(fbt_id, modname,
name, FBT_RETURN, 3, fbt);		name, FBT_RETURN, 3, fbt);
} else {		} else {
retfbt->fbtp_next = fbt;		retfbt->fbtp_probenext = fbt;
fbt->fbtp_id = retfbt->fbtp_id;		fbt->fbtp_id = retfbt->fbtp_id;
}		}
retfbt = fbt;		retfbt = fbt;

fbt->fbtp_patchpoint = instr;		fbt->fbtp_patchpoint = instr;
fbt->fbtp_ctl = lf;		fbt->fbtp_ctl = lf;
fbt->fbtp_loadcnt = lf->loadcnt;		fbt->fbtp_loadcnt = lf->loadcnt;
fbt->fbtp_symindx = symindx;		fbt->fbtp_symindx = symindx;
[14 lines not shown]

head/sys/cddl/dev/fbt/arm/fbt_isa.c

[159 lines not shown]
* We have a winner!		* We have a winner!
*/		*/
fbt = malloc(sizeof (fbt_probe_t), M_FBT, M_WAITOK | M_ZERO);		fbt = malloc(sizeof (fbt_probe_t), M_FBT, M_WAITOK | M_ZERO);
fbt->fbtp_name = name;		fbt->fbtp_name = name;
if (retfbt == NULL) {		if (retfbt == NULL) {
fbt->fbtp_id = dtrace_probe_create(fbt_id, modname,		fbt->fbtp_id = dtrace_probe_create(fbt_id, modname,
name, FBT_RETURN, 2, fbt);		name, FBT_RETURN, 2, fbt);
} else {		} else {
retfbt->fbtp_next = fbt;		retfbt->fbtp_probenext = fbt;
fbt->fbtp_id = retfbt->fbtp_id;		fbt->fbtp_id = retfbt->fbtp_id;
}		}
retfbt = fbt;		retfbt = fbt;

fbt->fbtp_patchpoint = instr;		fbt->fbtp_patchpoint = instr;
fbt->fbtp_ctl = lf;		fbt->fbtp_ctl = lf;
fbt->fbtp_loadcnt = lf->loadcnt;		fbt->fbtp_loadcnt = lf->loadcnt;
fbt->fbtp_symindx = symindx;		fbt->fbtp_symindx = symindx;
[14 lines not shown]

head/sys/cddl/dev/fbt/fbt.h

	[28 lines not shown]
	* Use is subject to license terms.			* Use is subject to license terms.
	*/			*/

	#ifndef _FBT_H_			#ifndef _FBT_H_
	#define _FBT_H_			#define _FBT_H_

	#include "fbt_isa.h"			#include "fbt_isa.h"

				/*
				* fbt_probe is a bit of a misnomer. One of these structures is created for
				* each trace point of an FBT probe. A probe might have multiple trace points
				* (e.g., a function with multiple return instructions), and different probes
				* might have a trace point at the same address (e.g., GNU ifuncs).
				*/
	typedef struct fbt_probe {			typedef struct fbt_probe {
	struct fbt_probe *fbtp_hashnext;			struct fbt_probe *fbtp_hashnext; /* global hash table linkage */
				struct fbt_probe *fbtp_tracenext; /* next probe for tracepoint */
				struct fbt_probe *fbtp_probenext; /* next tracepoint for probe */
				int fbtp_enabled;
	fbt_patchval_t *fbtp_patchpoint;			fbt_patchval_t *fbtp_patchpoint;
	int8_t fbtp_rval;			int8_t fbtp_rval;
	fbt_patchval_t fbtp_patchval;			fbt_patchval_t fbtp_patchval;
	fbt_patchval_t fbtp_savedval;			fbt_patchval_t fbtp_savedval;
	uintptr_t fbtp_roffset;			uintptr_t fbtp_roffset;
	dtrace_id_t fbtp_id;			dtrace_id_t fbtp_id;
	const char *fbtp_name;			const char *fbtp_name;
	modctl_t *fbtp_ctl;			modctl_t *fbtp_ctl;
	int fbtp_loadcnt;			int fbtp_loadcnt;
	int fbtp_symindx;			int fbtp_symindx;
	struct fbt_probe *fbtp_next;
	} fbt_probe_t;			} fbt_probe_t;

	struct linker_file;			struct linker_file;
	struct linker_symval;			struct linker_symval;
	struct trapframe;			struct trapframe;

	int fbt_invop(uintptr_t, struct trapframe *, uintptr_t);			int fbt_invop(uintptr_t, struct trapframe *, uintptr_t);
	void fbt_patch_tracepoint(fbt_probe_t *, fbt_patchval_t);			void fbt_patch_tracepoint(fbt_probe_t *, fbt_patchval_t);
	[16 lines not shown]

head/sys/cddl/dev/fbt/fbt.c

[150 lines not shown]
fbt_doubletrap(void)		fbt_doubletrap(void)
{		{
fbt_probe_t *fbt;		fbt_probe_t *fbt;
int i;		int i;

for (i = 0; i < fbt_probetab_size; i++) {		for (i = 0; i < fbt_probetab_size; i++) {
fbt = fbt_probetab[i];		fbt = fbt_probetab[i];

for (; fbt != NULL; fbt = fbt->fbtp_next)		for (; fbt != NULL; fbt = fbt->fbtp_probenext)
fbt_patch_tracepoint(fbt, fbt->fbtp_savedval);		fbt_patch_tracepoint(fbt, fbt->fbtp_savedval);
}		}
}		}

static void		static void
fbt_provide_module(void *arg, modctl_t *lf)		fbt_provide_module(void *arg, modctl_t *lf)
{		{
char modname[MAXPATHLEN];		char modname[MAXPATHLEN];
[32 lines not shown]

/*		/*
* List the functions in the module and the symbol values.		* List the functions in the module and the symbol values.
*/		*/
(void) linker_file_function_listall(lf, fbt_provide_module_function, modname);		(void) linker_file_function_listall(lf, fbt_provide_module_function, modname);
}		}

static void		static void
fbt_destroy(void *arg, dtrace_id_t id, void *parg)		fbt_destroy_one(fbt_probe_t *fbt)
{		{
fbt_probe_t *fbt = parg, *next, *hash, *last;		fbt_probe_t *hash, *hashprev, *next;
modctl_t *ctl;
int ndx;		int ndx;

do {
ctl = fbt->fbtp_ctl;

ctl->fbt_nentries--;

/*
* Now we need to remove this probe from the fbt_probetab.
*/
ndx = FBT_ADDR2NDX(fbt->fbtp_patchpoint);		ndx = FBT_ADDR2NDX(fbt->fbtp_patchpoint);
last = NULL;		for (hash = fbt_probetab[ndx], hashprev = NULL; hash != NULL;
hash = fbt_probetab[ndx];		hash = hash->fbtp_hashnext, hashprev = hash) {
		if (hash == fbt) {
while (hash != fbt) {		if ((next = fbt->fbtp_tracenext) != NULL)
ASSERT(hash != NULL);		next->fbtp_hashnext = hash->fbtp_hashnext;
last = hash;		else
hash = hash->fbtp_hashnext;		next = hash->fbtp_hashnext;
		if (hashprev != NULL)
		hashprev->fbtp_hashnext = next;
		else
		fbt_probetab[ndx] = next;
		goto free;
		} else if (hash->fbtp_patchpoint == fbt->fbtp_patchpoint) {
		for (next = hash; next->fbtp_tracenext != NULL;
		next = next->fbtp_tracenext) {
		if (fbt == next->fbtp_tracenext) {
		next->fbtp_tracenext =
		fbt->fbtp_tracenext;
		goto free;
}		}

if (last != NULL) {
last->fbtp_hashnext = fbt->fbtp_hashnext;
} else {
fbt_probetab[ndx] = fbt->fbtp_hashnext;
}		}
		}
next = fbt->fbtp_next;		}
		panic("probe %p not found in hash table", fbt);
		free:
free(fbt, M_FBT);		free(fbt, M_FBT);
		}

		static void
		fbt_destroy(void *arg, dtrace_id_t id, void *parg)
		{
		fbt_probe_t *fbt = parg, *next;
		modctl_t *ctl;

		do {
		ctl = fbt->fbtp_ctl;
		ctl->fbt_nentries--;

		next = fbt->fbtp_probenext;
		fbt_destroy_one(fbt);
fbt = next;		fbt = next;
} while (fbt != NULL);		} while (fbt != NULL);
}		}

static void		static void
fbt_enable(void *arg, dtrace_id_t id, void *parg)		fbt_enable(void *arg, dtrace_id_t id, void *parg)
{		{
fbt_probe_t *fbt = parg;		fbt_probe_t *fbt = parg;
[11 lines not shown; context: if (fbt_verbose) {]
printf("fbt is failing for probe %s "		printf("fbt is failing for probe %s "
"(module %s reloaded)",		"(module %s reloaded)",
fbt->fbtp_name, ctl->filename);		fbt->fbtp_name, ctl->filename);
}		}

return;		return;
}		}

for (; fbt != NULL; fbt = fbt->fbtp_next)		for (; fbt != NULL; fbt = fbt->fbtp_probenext) {
fbt_patch_tracepoint(fbt, fbt->fbtp_patchval);		fbt_patch_tracepoint(fbt, fbt->fbtp_patchval);
		fbt->fbtp_enabled++;
}		}
		}

static void		static void
fbt_disable(void *arg, dtrace_id_t id, void *parg)		fbt_disable(void *arg, dtrace_id_t id, void *parg)
{		{
fbt_probe_t *fbt = parg;		fbt_probe_t *fbt = parg, *hash;
modctl_t *ctl = fbt->fbtp_ctl;		modctl_t *ctl = fbt->fbtp_ctl;

ASSERT(ctl->nenabled > 0);		ASSERT(ctl->nenabled > 0);
ctl->nenabled--;		ctl->nenabled--;

if ((ctl->loadcnt != fbt->fbtp_loadcnt))		if ((ctl->loadcnt != fbt->fbtp_loadcnt))
return;		return;

for (; fbt != NULL; fbt = fbt->fbtp_next)		for (; fbt != NULL; fbt = fbt->fbtp_probenext) {
		fbt->fbtp_enabled--;

		for (hash = fbt_probetab[FBT_ADDR2NDX(fbt->fbtp_patchpoint)];
		hash != NULL; hash = hash->fbtp_hashnext) {
		if (hash->fbtp_patchpoint == fbt->fbtp_patchpoint) {
		for (; hash != NULL; hash = hash->fbtp_tracenext)
		if (hash->fbtp_enabled > 0)
		break;
		break;
		}
		}
		if (hash == NULL)
fbt_patch_tracepoint(fbt, fbt->fbtp_savedval);		fbt_patch_tracepoint(fbt, fbt->fbtp_savedval);
}		}
		}

static void		static void
fbt_suspend(void *arg, dtrace_id_t id, void *parg)		fbt_suspend(void *arg, dtrace_id_t id, void *parg)
{		{
fbt_probe_t *fbt = parg;		fbt_probe_t *fbt = parg;
modctl_t *ctl = fbt->fbtp_ctl;		modctl_t *ctl = fbt->fbtp_ctl;

ASSERT(ctl->nenabled > 0);		ASSERT(ctl->nenabled > 0);

if ((ctl->loadcnt != fbt->fbtp_loadcnt))		if ((ctl->loadcnt != fbt->fbtp_loadcnt))
return;		return;

for (; fbt != NULL; fbt = fbt->fbtp_next)		for (; fbt != NULL; fbt = fbt->fbtp_probenext)
fbt_patch_tracepoint(fbt, fbt->fbtp_savedval);		fbt_patch_tracepoint(fbt, fbt->fbtp_savedval);
}		}

static void		static void
fbt_resume(void *arg, dtrace_id_t id, void *parg)		fbt_resume(void *arg, dtrace_id_t id, void *parg)
{		{
fbt_probe_t *fbt = parg;		fbt_probe_t *fbt = parg;
modctl_t *ctl = fbt->fbtp_ctl;		modctl_t *ctl = fbt->fbtp_ctl;

ASSERT(ctl->nenabled > 0);		ASSERT(ctl->nenabled > 0);

if ((ctl->loadcnt != fbt->fbtp_loadcnt))		if ((ctl->loadcnt != fbt->fbtp_loadcnt))
return;		return;

for (; fbt != NULL; fbt = fbt->fbtp_next)		for (; fbt != NULL; fbt = fbt->fbtp_probenext)
fbt_patch_tracepoint(fbt, fbt->fbtp_patchval);		fbt_patch_tracepoint(fbt, fbt->fbtp_patchval);
}		}

static int		static int
fbt_ctfoff_init(modctl_t *lf, linker_ctf_t *lc)		fbt_ctfoff_init(modctl_t *lf, linker_ctf_t *lc)
{		{
const Elf_Sym *symp = lc->symtab;;		const Elf_Sym *symp = lc->symtab;;
const ctf_header_t *hp = (const ctf_header_t *) lc->ctftab;		const ctf_header_t *hp = (const ctf_header_t *) lc->ctftab;
[847 lines not shown]

head/sys/cddl/dev/fbt/mips/fbt_isa.c

head/sys/cddl/dev/fbt/powerpc/fbt_isa.c

head/sys/cddl/dev/fbt/riscv/fbt_isa.c

head/sys/cddl/dev/fbt/x86/fbt_isa.c

Attempt to support multiple tracepoints with the same address.