
Fix hwpmc "stalled" behavior
Closed, Public

Authored by jtl on Nov 10 2015, 8:24 PM.
Details

Summary

Currently, there is a single pm_stalled flag that tracks whether a
performance monitor was "stalled" due to insufficient ring buffer
space for samples. However, because the same performance monitor
can run on multiple processes or threads at the same time, a single
pm_stalled flag that impacts them all seems insufficient.

In particular, you can hit corner cases where the code fails to stop
performance monitors during a context switch out because it thinks
the performance monitor is already stopped. In reality, however, the
monitor may only have stalled on a different CPU.

This patch attempts to fix that behavior by tracking on a per-CPU basis
whether a PM desires to run and whether it is "stalled". This lets the
code make better decisions about when to stop PMs and when to try to
restart them. Ideally, we should avoid the case where the code fails
to stop a PM during a context switch out.
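As a rough, self-contained illustration of the per-CPU idea (not the committed
hwpmc code): the sketch below uses C11 atomics and made-up names (struct
pmc_sketch, pmc_csw_in(), pmc_csw_out()), whereas the real patch works with
FreeBSD's cpuset and atomic(9) primitives inside sys/dev/hwpmc. It only shows
the decision logic: on switch out, each CPU consults its own "stalled" bit, so
a stall on another CPU can no longer cause this CPU to skip stopping its
counter.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct pmc_sketch {
	atomic_ulong	pm_cpustate;	/* bit N set: CPU N wants this PM running */
	atomic_ulong	pm_stalled;	/* bit N set: this PM stalled on CPU N */
};

/* Mark the PM as wanted on 'cpu' at context switch in. */
static void
pmc_csw_in(struct pmc_sketch *pm, int cpu)
{
	atomic_fetch_or(&pm->pm_cpustate, 1UL << cpu);
}

/* At context switch out, decide whether to stop the counter on 'cpu'. */
static bool
pmc_csw_out(struct pmc_sketch *pm, int cpu)
{
	unsigned long bit = 1UL << cpu;

	/* This CPU no longer wants the PM running. */
	atomic_fetch_and(&pm->pm_cpustate, ~bit);

	if ((atomic_load(&pm->pm_stalled) & bit) == 0) {
		/*
		 * Not stalled on *this* CPU, so the counter is still
		 * running here and must be stopped, regardless of whether
		 * the same PM stalled on some other CPU.
		 */
		return (true);
	}
	/* Stalled here: the counter was already stopped when the stall hit. */
	atomic_fetch_and(&pm->pm_stalled, ~bit);
	return (false);
}

int
main(void)
{
	struct pmc_sketch pm;

	atomic_init(&pm.pm_cpustate, 0);
	atomic_init(&pm.pm_stalled, 0);

	pmc_csw_in(&pm, 1);
	atomic_fetch_or(&pm.pm_stalled, 1UL << 3);	/* Same PM stalls on CPU 3. */

	/* With per-CPU bits, CPU 1 still stops its counter on switch out. */
	printf("stop on CPU 1: %d\n", pmc_csw_out(&pm, 1));
	return (0);
}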

Sponsored by: Juniper Networks
Approved by: gnn (mentor) [Assuming he does...]
MFC after: 1 month

Test Plan

'make buildkernel' succeeds.

I ran a sampling performance monitor against a multi-threaded application
with a large workload in order to force stalls to occur. The statistics showed
that stalls did occur, but that sampling resumed afterward.

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

jtl retitled this revision from to Fix hwpmc "stalled" behavior.
jtl updated this object.
jtl edited the test plan for this revision. (Show Details)
jtl added reviewers: gnn, sjg, jhb, bz.

Please commit with
Approved by: gnn (mentor)

sys/dev/hwpmc/hwpmc_mod.c
4446 (On Diff #10105)

Is this an old bug that it would stop unconditionally here (even without making the flags per-CPU)?

4482 (On Diff #10105)

Pity we don't have some sort of 'CPU_READANDCLEAR(cpu, &set)', but it would probably require an atomic_cmpset loop to implement. I believe that you are assuming that for a given CPU the bit doesn't require atomic updates, but that you have to use atomics only for the sake of the bits for other CPUs? That is, if you were to instead have made 'int pm_stalled[MAXCPU]' you could use non-atomic ops on 'pm_stalled[curcpu]', but the use of cpusets is to minimize storage overhead? If so, having the test and clear in separate ops is fine.
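Purely for illustration, here is a sketch of what the wished-for
CPU_READANDCLEAR() could look like for a set that fits in one word, using a
C11 compare-exchange loop as a stand-in for an atomic_cmpset loop; the helper
name and signature are hypothetical (as the comment notes, no such primitive
exists) and this is not part of the diff.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Atomically clear 'cpu's bit in *set and report whether it was set. */
static bool
cpu_readandclear(int cpu, atomic_ulong *set)
{
	unsigned long bit = 1UL << cpu;
	unsigned long old, new;

	old = atomic_load(set);
	do {
		if ((old & bit) == 0)
			return (false);	/* Bit already clear; nothing to do. */
		new = old & ~bit;
		/* On failure, 'old' is reloaded and the loop retries. */
	} while (!atomic_compare_exchange_weak(set, &old, new));
	return (true);
}

int
main(void)
{
	atomic_ulong set;

	atomic_init(&set, 1UL << 2);
	printf("%d %d\n", cpu_readandclear(2, &set), cpu_readandclear(2, &set));
	return (0);
}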

sys/dev/hwpmc/hwpmc_mod.c
4446 (On Diff #10105)

Yes. It was at least inconsistent for it to try to stop it without checking the pm_stalled state.

4482 (On Diff #10105)

Yes, that is correct: this is just to minimize storage overhead. Otherwise, these could just be arrays of booleans. (And, maybe they should be, if the cost of storage is cheaper than the cost of atomic operations on large-scale systems. However, I followed the other code in the file, which handles per-CPU bit masks this way.)

If we hit the true conditional here, we should have already been running on this CPU (due to a context switch in). Because the cpustate bit is only set during a context switch in (which shouldn't occur at this point), it shouldn't matter if there is a window between checking the cpustate bit, clearing the cpustate bit, and checking the stalled state.
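For contrast, a sketch of the array-of-flags alternative mentioned above, with
hypothetical names (PMC_SKETCH_MAXCPU standing in for MAXCPU, the cpu argument
standing in for curcpu): because only the owning CPU ever touches its own
slot, plain loads and stores suffice and the test and clear need not be a
single atomic operation, at the cost of one int per CPU per PM rather than one
bit in a cpuset.

#include <stdbool.h>

#define	PMC_SKETCH_MAXCPU	256	/* stand-in for MAXCPU */

struct pmc_percpu_sketch {
	int	pm_stalled[PMC_SKETCH_MAXCPU];	/* nonzero: stalled on that CPU */
};

/*
 * Test and clear the stalled flag for the CPU we are running on.  No other
 * CPU writes this slot, so the two steps need no atomics.
 */
static bool
pmc_test_and_clear_stalled(struct pmc_percpu_sketch *pm, int cpu)
{
	bool was_stalled = (pm->pm_stalled[cpu] != 0);

	pm->pm_stalled[cpu] = 0;
	return (was_stalled);
}

int
main(void)
{
	struct pmc_percpu_sketch pm = { { 0 } };

	pm.pm_stalled[1] = 1;
	return (pmc_test_and_clear_stalled(&pm, 1) ? 0 : 1);
}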

jhb edited edge metadata.
This revision is now accepted and ready to land. Nov 12 2015, 6:16 PM
gnn edited edge metadata.
jtl marked 4 inline comments as done. Nov 14 2015, 1:23 AM
This revision was automatically updated to reflect the committed changes.