Page MenuHomeFreeBSD

Feature enhancements to pmcstat
AbandonedPublic

Authored by kmacy on May 3 2018, 1:07 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Mar 31, 2:24 AM
Unknown Object (File)
Mar 17 2024, 11:00 AM
Unknown Object (File)
Mar 10 2024, 1:15 PM
Unknown Object (File)
Mar 3 2024, 1:36 AM
Unknown Object (File)
Feb 25 2024, 11:12 AM
Unknown Object (File)
Feb 25 2024, 9:50 AM
Unknown Object (File)
Feb 17 2024, 6:35 PM
Unknown Object (File)
Feb 9 2024, 2:23 PM

Details

Summary
  • Compile in perf counters table on supported arches, providing the following
      • short and long descriptions of all counters
      • easy to add counter definitions in user space
      • sensible default event rate sampling values
    • Add option to pmcstat to not resolve leaf function in top mode, but rather print the IP itself so one can easily identify the hot instruction -I
    • use perf counter table for sensible default sampling rate so we don't DOS the box we're sampling on if we don't know a sensible value for -n
  • increase the default update frequency of top mode so as to be less likely to be DOSd by UNHALTED_CORE_CYCLES (still possible, will require surgery to fix properly)

A couple of additional notes:

  • It may make more sense to compile the counter table in to libpmcstat
  • It would be nice to extend pmccontrol to print the descriptions provided by the table
  • Longer term it would be nice to add an interface to pmc so that we could just import updated tables from perf in the future rather than rely on the cumbersome and error prone header updates we rely on currently.

depends on sysctl added in D15155

ak@linux.intel.com
Thu, 03 May 2018 15:30:28 -0700
"mmacy@mattmacy.io" <mmacy@mattmacy.io>
> This seems fairly contrary to Intel's intent. Can you comment?

You're right they were intended to be BSD licensed. Should
fix the header. Thanks.

BTW the master copy of these files resides in
https://github.com/andikleen/pmu-tools and they are of course
still BSD licensed there. But the perf version has already
diverged a bit.

-Andi

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 16407

Event Timeline

kmacy edited the summary of this revision. (Show Details)
kmacy edited the summary of this revision. (Show Details)
kmacy edited the summary of this revision. (Show Details)

remove bogus license tags, the source is clearly BSD from Intel

How does an event description from the json tables is matched against the index from pmc_events.h ?

In D15275#322071, @kib wrote:

How does an event description from the json tables is matched against the index from pmc_events.h ?

It works so long as the FreeBSD version was named correctly. I have an aliases table pmu_utils.c for things like UNHALTED_CORE_CYCLES and LLC_MISSES. If the table lookup fails it will just use the default sampling rate that is used on HEAD. Ultimately, on supported architectures I'd like to switch from using the ad hoc defines in pmc_events.h to using the json tables from Intel, IBM, and Cavium.

Hi Andi -
Thanks for the prompt response. I will definitely be taking a look at the upstream version. All I've done so far is add table generation to the build process and use the generated tables to provide sane default sampling rates . https://reviews.freebsd.org/D15275

If you use it for a different tool I would recommend to not build the tables in,
but just use the libjevent parser to read the JSON tables at runtime.

My own tooling in pmu-tools even automatically downloads them from the Intel
download site. That will make it much easier for users to update them.
Other options would be to regularly fetch a bundle and ship that.

The building in was more a political compromise in perf.

Also I would expect libjevents have a few Linux'isms, I would
be open to patches to make at least the core JSON code more friendly
to other OS. There are also perf specific parts that are not
portable at all, but those could be perhaps easily excluded.

Also it's good that you have the events in user tools now, not
in kernel space anymore.

-Andi

In D15275#322071, @kib wrote:

How does an event description from the json tables is matched against the index from pmc_events.h ?

It works so long as the FreeBSD version was named correctly. I have an aliases table pmu_utils.c for things like UNHALTED_CORE_CYCLES and LLC_MISSES. If the table lookup fails it will just use the default sampling rate that is used on HEAD. Ultimately, on supported architectures I'd like to switch from using the ad hoc defines in pmc_events.h to using the json tables from Intel, IBM, and Cavium.

So did you verified that the name matches ? What are the plan for non-matching names ?

I agree (and do not see a problem with it) that our ad-hoc names should be changed to the Intel names.

Also I think that importing tables in userspace is really a half measure. Right now we must update both kernel and userspace to get new event added, and in the course of it we have to break pmc(4) ABI. IMO the tables should live in the kernel, it is not a problem when hwpmc(4) is a module, or hwpmc(4) can be a minimal core, with microarch-specific submodules loaded as needed. Userspace should fetch the table from kernel and use kernel handles for events.

In D15275#322358, @kib wrote:
In D15275#322071, @kib wrote:

How does an event description from the json tables is matched against the index from pmc_events.h ?

It works so long as the FreeBSD version was named correctly. I have an aliases table pmu_utils.c for things like UNHALTED_CORE_CYCLES and LLC_MISSES. If the table lookup fails it will just use the default sampling rate that is used on HEAD. Ultimately, on supported architectures I'd like to switch from using the ad hoc defines in pmc_events.h to using the json tables from Intel, IBM, and Cavium.

So did you verified that the name matches ? What are the plan for non-matching names ?

Mostly the names match. Switching to using the Intel names in the tables will fix it for good.

Also I think that importing tables in userspace is really a half measure. Right now we must update both kernel and userspace to get new event added, and in the course of it we have to break pmc(4) ABI. IMO the tables should live in the kernel, it is not a problem when hwpmc(4) is a module, or hwpmc(4) can be a minimal core, with microarch-specific submodules loaded as needed. Userspace should fetch the table from kernel and use kernel handles for events.

I don't agree and neither does Intel (see Andi's mail). We have a table of bits we can pass to the kernel as an ioctl. One could claim that defining them in the kernel buys some safety or compatibility guarantees, but that doesn't actually hold water in practice. Being able to add newly exposed PMCs without having to modify the kernel is a step forward, not a half measure. There are dozens if not hundreds of non-public PMCs on Zen processors. With this mechanism we could simply add a new table as opposed to laboriously and in error prone fashion copy from the docs.

In D15275#322358, @kib wrote:
In D15275#322071, @kib wrote:

How does an event description from the json tables is matched against the index from pmc_events.h ?

It works so long as the FreeBSD version was named correctly. I have an aliases table pmu_utils.c for things like UNHALTED_CORE_CYCLES and LLC_MISSES. If the table lookup fails it will just use the default sampling rate that is used on HEAD. Ultimately, on supported architectures I'd like to switch from using the ad hoc defines in pmc_events.h to using the json tables from Intel, IBM, and Cavium.

So did you verified that the name matches ? What are the plan for non-matching names ?

Mostly the names match. Switching to using the Intel names in the tables will fix it for good.

Also I think that importing tables in userspace is really a half measure. Right now we must update both kernel and userspace to get new event added, and in the course of it we have to break pmc(4) ABI. IMO the tables should live in the kernel, it is not a problem when hwpmc(4) is a module, or hwpmc(4) can be a minimal core, with microarch-specific submodules loaded as needed. Userspace should fetch the table from kernel and use kernel handles for events.

I don't agree and neither does Intel (see Andi's mail). We have a table of bits we can pass to the kernel as an ioctl. One could claim that defining them in the kernel buys some safety or compatibility guarantees, but that doesn't actually hold water in practice. Being able to add newly exposed PMCs without having to modify the kernel is a step forward, not a half measure. There are dozens if not hundreds of non-public PMCs on Zen processors. With this mechanism we could simply add a new table as opposed to laboriously and in error prone fashion copy from the docs.

Given how much trouble it is to continuously keep these definitions up to date and how often we've fallen behind I strongly support using an external definition. If we really wanted to, we could include a json parse/generate as part of the kernel build or as a checked-in autogenerated file that can be rebuilt with the module. I would only suggest this if it is possible to construct malicious counter definitions. I don't know if this is the case or not.

In D15275#324870, @jeff wrote:
In D15275#322358, @kib wrote:
In D15275#322071, @kib wrote:

How does an event description from the json tables is matched against the index from pmc_events.h ?

It works so long as the FreeBSD version was named correctly. I have an aliases table pmu_utils.c for things like UNHALTED_CORE_CYCLES and LLC_MISSES. If the table lookup fails it will just use the default sampling rate that is used on HEAD. Ultimately, on supported architectures I'd like to switch from using the ad hoc defines in pmc_events.h to using the json tables from Intel, IBM, and Cavium.

So did you verified that the name matches ? What are the plan for non-matching names ?

Mostly the names match. Switching to using the Intel names in the tables will fix it for good.

Also I think that importing tables in userspace is really a half measure. Right now we must update both kernel and userspace to get new event added, and in the course of it we have to break pmc(4) ABI. IMO the tables should live in the kernel, it is not a problem when hwpmc(4) is a module, or hwpmc(4) can be a minimal core, with microarch-specific submodules loaded as needed. Userspace should fetch the table from kernel and use kernel handles for events.

I don't agree and neither does Intel (see Andi's mail). We have a table of bits we can pass to the kernel as an ioctl. One could claim that defining them in the kernel buys some safety or compatibility guarantees, but that doesn't actually hold water in practice. Being able to add newly exposed PMCs without having to modify the kernel is a step forward, not a half measure. There are dozens if not hundreds of non-public PMCs on Zen processors. With this mechanism we could simply add a new table as opposed to laboriously and in error prone fashion copy from the docs.

Given how much trouble it is to continuously keep these definitions up to date and how often we've fallen behind I strongly support using an external definition. If we really wanted to, we could include a json parse/generate as part of the kernel build or as a checked-in autogenerated file that can be rebuilt with the module. I would only suggest this if it is possible to construct malicious counter definitions. I don't know if this is the case or not.

This is not the thing I responded to. I am fine with generating tables from the officially provided intel material. What is being discussed is where to keep the generated freebsd tables, in the kernel driver or in userspace library. My opinion was that we should generate kernel tables, and userspace should fetch these descriptions from kernel.

Matt saying that it is enough to keep the tables in userspace, and provide the kernel with the bits that must be loaded into MSR PERF_EVENTSEL or FIXED_CTRL when a counter is activated. There is indeed exists an Intel whitepaper 334467 which explains which bits might be safely taken from the userspace for these registers.

So suppose we go with the 'tables in userspace' proposal. Does the same sanity checking firewall for safe programming of the counters exists for other arches, and even for AMDs ? Or do we need to have the generic bit-passing interface for Intels, and keep the tables (both in user- and kernel- space ?) for other CPUs ?