
Add intel_rapl driver
Needs RevisionPublic

Authored by thj on Mon, May 4, 3:07 PM.

Details

Summary

Running Average Power Limit (RAPL) is an Intel technology that allows reading
measurements of core and system power usage at run time with low
overhead.

RAPL can be used to instrument power usage for individual function
calls, but it is also useful at a coarser granularity as a tool for
understanding system power usage.

RAPL can report core device power usage, uncore device power usage
(usually documented as the onboard graphics) and platform power usage
(everything attached to the core, such as PCIe devices).

RAPL can also report memory power usage.

RAPL can be used to set limits on power usage for the core or platform.
These limits can be used to restrict power consumption and set thermal
limits. Setting limits is an outstanding item for this driver.
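
As a rough illustration of how a RAPL reading becomes a power figure, the sketch below follows the layout documented in the SDM: the energy status units sit in bits 12:8 of MSR_RAPL_POWER_UNIT, and the energy counters (such as MSR_PKG_ENERGY_STATUS) are 32-bit values that wrap around. The function names and the sample values are assumptions made for this example and are not taken from the driver under review.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Joules per counter increment. The Energy Status Units field is
 * bits 12:8 of MSR_RAPL_POWER_UNIT; one increment is 1 / 2^ESU joules.
 */
static double
rapl_energy_unit_joules(uint64_t power_unit_msr)
{
	unsigned int esu = (power_unit_msr >> 8) & 0x1f;

	return (1.0 / (double)(1ULL << esu));
}

/*
 * Counter increments between two samples. Unsigned 32-bit subtraction
 * gives the right answer across a single wraparound; more than one
 * wrap between samples cannot be detected this way.
 */
static uint32_t
rapl_energy_delta(uint32_t prev, uint32_t cur)
{
	return (cur - prev);
}

/* Average power in watts over an interval of `seconds`. */
static double
rapl_average_watts(uint64_t power_unit_msr, uint32_t prev, uint32_t cur,
    double seconds)
{
	assert(seconds > 0.0);
	return (rapl_energy_delta(prev, cur) *
	    rapl_energy_unit_joules(power_unit_msr) / seconds);
}
```

With an illustrative unit register value of 0xA0E03 (ESU = 14, so one increment is 1/16384 J), a counter that advances by 81920 over one second corresponds to an average of 5 W.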

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 72769
Build 69652: arc lint + arc unit

Event Timeline

thj requested review of this revision. Mon, May 4, 3:07 PM

what's the earliest chipset this supports? (eg the Atom C2xxx / C3xxx ones?)

olce requested changes to this revision. Tue, May 12, 8:03 PM

As is, this does not compile. I've proposed fixes in inline comments.

I'll review the meat tomorrow.

sys/modules/intel_rapl/Makefile
1–11

I do not follow what is done here. To me, most lines look superfluous, and the Makefile cannot possibly work as is.

Proposing a simpler alternative that should work.

After a bit more research: This is missing a change in sys/modules/Makefile:

diff --git a/sys/modules/Makefile b/sys/modules/Makefile
index faedb856977c..cbbe6cfb7044 100644
--- a/sys/modules/Makefile
+++ b/sys/modules/Makefile
@@ -175,6 +175,7 @@ SUBDIR=     \
        ${_igc} \
        imgact_binmisc \
        ${_imx} \
+       ${_intel_rapl} \
        ${_intelspi} \
        ${_io} \
        ${_ioat} \
@@ -768,6 +769,7 @@ _et=                et
 _ftgpio=       ftgpio
 _ftwd=         ftwd
 _exca=         exca
+_intel_rapl=   intel_rapl
 _io=           io
 _itwd=         itwd
 _ix=           ix

without which the module is not compiled at all. With this change and the inline diff in this comment, the module compiles correctly.

sys/x86/power/intel_rapl.c
477

Compiler error here.

This revision now requires changes to proceed. Tue, May 12, 8:03 PM

I also see that sysctl knob handlers are not necessarily executed on the specific CPU the driver represents, so readings may be taken from an arbitrary core. I suggest looking at what we did in sys/x86/cpufreq/hwpstate_amd.c (smp_rendezvous_cpu() generally, and for simpler cases not requiring specific consistency, x86_msr_op()).

what's the earliest chipset this supports? (eg the Atom C2xxx / C3xxx ones?)

This supports CPUs, not chipsets. Yes, the distinction is sometimes blurry, but the registers used here are (supposed to be) CPUs' ones and are documented in Intel's SDM, contrary to what is used in D54882 (which is mostly undocumented, except if you consider Linux code as documentation, but still that is not "official").

Comments about C2xx in D56790 give a partial answer to your question for the hardware part. Yes, these are supposed to be supported. To what extent, I'm not sure. SDM vol 4 says that only the TDP can be retrieved (MSR_PKG_POWER_INFO) and a limit can be set (MSR_PKG_POWER_LIMIT), but it does not mention MSR_PKG_ENERGY_STATUS, which reports the actual energy consumption (updated every msec). On the other hand, SDM vol 3 says that MSR_PKG_ENERGY_STATUS is "non-optional" for the PKG level, so we could expect it to be present given that MSR_PKG_POWER_LIMIT (also "non-optional") is listed as present in vol 4.

So a definitive answer here can only come from actual tests.

As for the software part (this driver), in its current state it does not seem to support these, as e.g. I don't see model 0x4D being listed.
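
For context on what a model check such as "0x4D" refers to, here is a minimal sketch of how the display family and model are derived from the CPUID leaf 1 EAX signature, following the rule in the SDM (the extended model is folded in for families 0x6 and 0xF). The struct and function names are made up for this example, and the sketch does not reproduce the driver's actual model table.

```c
#include <stdint.h>

/* Decoded CPUID.1:EAX signature (names are illustrative only). */
struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
};

static struct cpu_sig
decode_cpuid_signature(uint32_t eax)
{
	struct cpu_sig s;
	unsigned int base_family = (eax >> 8) & 0xf;
	unsigned int base_model = (eax >> 4) & 0xf;
	unsigned int ext_model = (eax >> 16) & 0xf;
	unsigned int ext_family = (eax >> 20) & 0xff;

	s.stepping = eax & 0xf;

	/* The extended family is only added when the base family is 0xF. */
	s.family = base_family;
	if (base_family == 0xf)
		s.family += ext_family;

	/* The extended model is only used for base families 0x6 and 0xF. */
	s.model = base_model;
	if (base_family == 0x6 || base_family == 0xf)
		s.model |= ext_model << 4;

	return (s);
}
```

A signature of 0x000406D8, for instance, decodes to family 0x6, model 0x4D, stepping 8, which is the kind of value a per-model list would have to contain for these parts to be matched.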

what's the earliest chipset this supports? (eg the Atom C2xxx / C3xxx ones?)

The earliest hardware supported by the first version of this driver is 6th generation. I have used the models from the SDM vol4 table "models in 6th-13th gen plus some other assorted stuff" chapter.

thj marked an inline comment as done. Wed, May 13, 2:11 PM
In D56791#1305613, @thj wrote:

what's the earliest chipset this supports? (eg the Atom C2xxx / C3xxx ones?)

The earliest hardware supported by the first version of this driver is 6th generation. I have used the models from the SDM vol4 table "models in 6th-13th gen plus some other assorted stuff" chapter.

Olce convinced me, and I'll generalise support a bit more so we can avoid the model checks.