
Fix hwpmc "stalled" behavior
Closed, Public

Authored by jtl on Nov 10 2015, 8:24 PM.
Details

Summary

Currently, there is a single pm_stalled flag that tracks whether a
performance monitor was "stalled" due to insufficient ring buffer
space for samples. However, because the same performance monitor
can run on multiple processes or threads at the same time, a single
pm_stalled flag that impacts them all seems insufficient.

In particular, you can hit corner cases where the code fails to stop
performance monitors during a context switch out because it thinks
the performance monitor is already stopped. In reality, however, the
monitor may only have stalled on a different CPU.

This patch attempts to fix that behavior by tracking on a per-CPU basis
whether a PM desires to run and whether it is "stalled". This lets the
code make better decisions about when to stop PMs and when to try to
restart them. Ideally, we should avoid the case where the code fails
to stop a PM during a context switch out.
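As a rough, self-contained illustration of the per-CPU idea (not the committed
hwpmc code): the sketch below uses C11 atomics and made-up names (struct
pmc_sketch, pmc_csw_in(), pmc_csw_out()), whereas the real patch works with
FreeBSD's cpuset and atomic(9) primitives inside sys/dev/hwpmc. It only shows
the decision logic: on switch out, each CPU consults its own "stalled" bit, so
a stall on another CPU can no longer cause this CPU to skip stopping its
counter.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct pmc_sketch {
	atomic_ulong	pm_cpustate;	/* bit N set: CPU N wants this PM running */
	atomic_ulong	pm_stalled;	/* bit N set: this PM stalled on CPU N */
};

/* Mark the PM as wanted on 'cpu' at context switch in. */
static void
pmc_csw_in(struct pmc_sketch *pm, int cpu)
{
	atomic_fetch_or(&pm->pm_cpustate, 1UL << cpu);
}

/* At context switch out, decide whether to stop the counter on 'cpu'. */
static bool
pmc_csw_out(struct pmc_sketch *pm, int cpu)
{
	unsigned long bit = 1UL << cpu;

	/* This CPU no longer wants the PM running. */
	atomic_fetch_and(&pm->pm_cpustate, ~bit);

	if ((atomic_load(&pm->pm_stalled) & bit) == 0) {
		/*
		 * Not stalled on *this* CPU, so the counter is still
		 * running here and must be stopped, regardless of whether
		 * the same PM stalled on some other CPU.
		 */
		return (true);
	}
	/* Stalled here: the counter was already stopped when the stall hit. */
	atomic_fetch_and(&pm->pm_stalled, ~bit);
	return (false);
}

int
main(void)
{
	struct pmc_sketch pm;

	atomic_init(&pm.pm_cpustate, 0);
	atomic_init(&pm.pm_stalled, 0);

	pmc_csw_in(&pm, 1);
	atomic_fetch_or(&pm.pm_stalled, 1UL << 3);	/* Same PM stalls on CPU 3. */

	/* With per-CPU bits, CPU 1 still stops its counter on switch out. */
	printf("stop on CPU 1: %d\n", pmc_csw_out(&pm, 1));
	return (0);
}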

Sponsored by: Juniper Networks
Approved by: gnn (mentor) [Assuming he does...]
MFC after: 1 month

Test Plan

'make buildkernel' succeeds.

I ran a sampling performance monitor against a multi-threaded application
with a large workload in order to force stalls to occur. The statistics showed
that stalls did occur, but that sampling resumed afterward.

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

jtl retitled this revision from to Fix hwpmc "stalled" behavior.
jtl updated this object.
jtl edited the test plan for this revision. (Show Details)
jtl added reviewers: gnn, sjg, jhb, bz.

Please commit with
Approved by: gnn (mentor)

sys/dev/hwpmc/hwpmc_mod.c
4446 (On Diff #10105)

Is this an old bug that it would stop unconditionally here (even without making the flags per-CPU)?

4482 (On Diff #10105)

Pity we don't have some sort of 'CPU_READANDCLEAR(cpu, &set)', but it would probably require an atomic_cmpset loop to implement. I believe that you are assuming that for a given CPU the bit doesn't require atomic updates, but that you have to use atomics only for the sake of the bits for other CPUs? That is, if you were to instead have made 'int pm_stalled[MAXCPU]' you could use non-atomic ops on 'pm_stalled[curcpu]', but the use of cpusets is to minimize storage overhead? If so, having the test and clear in separate ops is fine.
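Purely for illustration, here is a sketch of what the wished-for
CPU_READANDCLEAR() could look like for a set that fits in one word, using a
C11 compare-exchange loop as a stand-in for an atomic_cmpset loop; the helper
name and signature are hypothetical (as the comment notes, no such primitive
exists) and this is not part of the diff.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Atomically clear 'cpu's bit in *set and report whether it was set. */
static bool
cpu_readandclear(int cpu, atomic_ulong *set)
{
	unsigned long bit = 1UL << cpu;
	unsigned long old, new;

	old = atomic_load(set);
	do {
		if ((old & bit) == 0)
			return (false);	/* Bit already clear; nothing to do. */
		new = old & ~bit;
		/* On failure, 'old' is reloaded and the loop retries. */
	} while (!atomic_compare_exchange_weak(set, &old, new));
	return (true);
}

int
main(void)
{
	atomic_ulong set;

	atomic_init(&set, 1UL << 2);
	printf("%d %d\n", cpu_readandclear(2, &set), cpu_readandclear(2, &set));
	return (0);
}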

sys/dev/hwpmc/hwpmc_mod.c
4446 (On Diff #10105)

Yes. It was at least inconsistent for it to try to stop it without checking the pm_stalled state.

4482 (On Diff #10105)

Yes, that is correct: this is just to minimize storage overhead. Otherwise, these could just be arrays of booleans. (And, maybe they should be, if the cost of storage is cheaper than the cost of atomic operations on large-scale systems. However, I followed the other code in the file, which handles per-CPU bit masks this way.)

If we hit the true conditional here, we should have already been running on this CPU (due to a context switch in). Because the cpustate bit is only set during a context switch in (which shouldn't occur at this point), it shouldn't matter if there is a window between checking the cpustate bit, clearing the cpustate bit, and checking the stalled state.
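For contrast, a sketch of the array-of-flags alternative mentioned above, with
hypothetical names (PMC_SKETCH_MAXCPU standing in for MAXCPU, the cpu argument
standing in for curcpu): because only the owning CPU ever touches its own
slot, plain loads and stores suffice and the test and clear need not be a
single atomic operation, at the cost of one int per CPU per PM rather than one
bit in a cpuset.

#include <stdbool.h>

#define	PMC_SKETCH_MAXCPU	256	/* stand-in for MAXCPU */

struct pmc_percpu_sketch {
	int	pm_stalled[PMC_SKETCH_MAXCPU];	/* nonzero: stalled on that CPU */
};

/*
 * Test and clear the stalled flag for the CPU we are running on.  No other
 * CPU writes this slot, so the two steps need no atomics.
 */
static bool
pmc_test_and_clear_stalled(struct pmc_percpu_sketch *pm, int cpu)
{
	bool was_stalled = (pm->pm_stalled[cpu] != 0);

	pm->pm_stalled[cpu] = 0;
	return (was_stalled);
}

int
main(void)
{
	struct pmc_percpu_sketch pm = { { 0 } };

	pm.pm_stalled[1] = 1;
	return (pmc_test_and_clear_stalled(&pm, 1) ? 0 : 1);
}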

jhb edited edge metadata.
This revision is now accepted and ready to land. Nov 12 2015, 6:16 PM
gnn edited edge metadata.
jtl marked 4 inline comments as done. Nov 14 2015, 1:23 AM
This revision was automatically updated to reflect the committed changes.