Page MenuHomeFreeBSD

Improve accuracy of PMC sampling frequency
AbandonedPublic

Authored by jtl on Nov 10 2015, 7:20 PM.
Tags
None
Referenced Files
F103597971: D4122.diff
Tue, Nov 26, 11:19 PM
Unknown Object (File)
Mon, Nov 25, 5:19 AM
Unknown Object (File)
Sat, Nov 23, 8:25 PM
Unknown Object (File)
Sat, Nov 23, 5:54 AM
Unknown Object (File)
Wed, Nov 20, 4:47 PM
Unknown Object (File)
Wed, Nov 20, 1:51 AM
Unknown Object (File)
Thu, Nov 7, 4:48 AM
Unknown Object (File)
Oct 19 2024, 7:32 PM
Subscribers

Details

Reviewers
sjg
jhb
gnn
bz
Summary

The code tracks a counter which is the number of events until the next
sample. On context switch in, it loads the saved counter. On context
switch out, it tries to calculate a new saved counter.

Problems:

  1. The saved counter was shared by all threads in a process. However, this

means that all threads would be initially loaded with the same saved
counter. However, that could result in sampling more often than once every
X number of events.

  1. The calculation to determine a new saved counter was backwards. It

added when it should have subtracted, and subtracted when it should have
added. Assume a single-threaded process with a reload count of 1000 events.
Assuming the counter on context switch in was 100 and the counter on context
switch out was 50 (meaning the thread has "consumed" 50 more events), the
code would calculate a new saved counter of 150 (instead of the proper 50).

Fix:

  1. As soon as the saved counter is used to initialize a monitor for a

thread on context switch in, set the saved counter to the reload count.
That way, subsequent threads to use the saved counter will get the full
reload count, assuring we sample at least once every X number of events
(across all threads).

  1. Change the calculation of the saved counter. Due to the change to the

saved counter in #1, we simply need to add (modulo the reload count) the
remaining counter time we retrieve from the CPU when a thread is context
switched out.

Sponsored by: Juniper Networks
Approved by: gnn (mentor) [Assuming he does, of course. ;-)]
MFC after: 1 month

Test Plan

Prior to applying this patch, I saw unexpected variations (sometimes large)
in the number of samples gathered by HWPMC.

After applying this patch, the number of samples gathered by HWPMC became
more stable.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 1098
Build 1102: arc lint + arc unit

Event Timeline

jtl retitled this revision from to Improve accuracy of PMC sampling frequency.
jtl updated this object.
jtl edited the test plan for this revision. (Show Details)
jtl added reviewers: sjg, gnn, jhb, bz.
gnn edited edge metadata.
This revision is now accepted and ready to land.Nov 12 2015, 2:27 AM
sys/dev/hwpmc/hwpmc_mod.c
1279

Is this comment stale now since it still says "per-process"?

1285

Nit: style(9) prefers full sentences for comments, and a newline before a new comment. I would perhaps expand the comment further along the lines of:

/*
  * Use the saved value from the most recent thread switch out to start
  * this counter.  Reset the saved count so in case another thread from this
  * process switches in before any threads switch out.
  */

I think this still highlights a bit of a flaw in the approach though. Suppose you had a process with two threads. Suppose the reload count is N. Suppose that for whatever reason, both threads A and B always trigger N - 1 events during each "slice" between context switches.
You can then have this timeline:

  • Thread A starts with an initial count of N.
  • Thread B starts with an initial count of N.
  • A switches out after N - 1 events, so no sample is logged during this slice, and the saved count is set to 1.
  • B switches out after N - 1 events, so no sample is logged either.

foo:

  • A resumes with a count of 1. This resets the saved count to N.
  • B resumes with a count of N.
  • A switches out after N - 1 events, (1 sample logged) and the saved count is set to 2.
  • B switches out after N - 1 events, no samples logged, saved count set to 1.
  • goto foo and repeat

The end result in this pathological case is that B would never log any samples. I'm not sure how you can fix this apart from having per-thread saved counts. Part of the problem is that once you get in the loop, A keeps resuming with B's saved count (which is a condition that can occur in the existing algorithm as well).

1438

This is definitely clearer than the old approach, I almost made a similar change the last time I looked at this as well as the current code is non-obvious. If pp_pmcval were per-thread you'd think you could just do 'pp_pmcval = newvalue' directly.

sys/dev/hwpmc/hwpmc_mod.c
1279

It isn't really stale; we're still tracking this on a per-process basis. (Although, see my later comments on threads.)

1438

If pp_pmcval were per-thread ...

I made this change to improve the algorithm, but I definitely think the per-thread approach is ultimately better (albeit more complex in other ways).

If you'd like, I can work up such a change instead. I have a chunk of the code from another change I was considering.

sys/dev/hwpmc/hwpmc_mod.c
1438

I think this is an improvement on its own, but if per-thread pmcvals is feasible I think that would be good to implement as a follow-up change.

jtl edited edge metadata.

Revised comment per @jhb's suggestion.

This revision now requires review to proceed.Nov 14 2015, 1:13 AM
jtl marked 6 inline comments as done.Nov 14 2015, 1:14 AM