Intel Speed Shift is Intel's technology to control frequency in
hardware, with hints from software.
Submitted by: bwidawsk
Differential D18028
Add support for Intel Speed Shift scottph on Nov 18 2018, 1:09 AM. Authored by Tags None Referenced Files
Details
Diff Detail
Event TimelineThere are a very large number of changes, so older changes are hidden. Show Older Changes Comment Actions BTW, this booted fine on my X1 today (I just pulled the git branch). I think it's worth getting the current type of approach in for now, but eventually this cpufreq design needs to be rethought as it's a mess. Realistically you want something like this to happen for each CPU:
I think the way I would do this is to kill the info-only ACPI perf driver by having the acpi_cpu.c driver for cpuX devices fetch the info from _CST or whatever it is and provide that as an ivar on the cpuX device itself. Then cpuX would add a single "cpufreq" driver and the various cpufreq drivers would all end up sharing that devclass I think. I think this would mostly kill off p4tcc as it's priority is such that it would never win (same with acpi_throttle), but that is probably ok. If we really cared about still supporting both systems that did both P-states and T-states then we would basically create two device nodes under the CPU (one for P-states and one for T-states) and let the drivers attach to the relevant one. However, I would probably not bother with that. It seems like too much work for little gain. Leaving the T-state drivers around would let them still be used for CPUs without P-states, we would just lose the ability to mix the two (which is probably a feature anyway).
Comment Actions After Yamagi brought this work to my attention and told me that it was possible to apply this patch to FreeBSD 12.0 I created this diff from the git repo. I applied the patch to FreeBSD 12.0 on a T470s. With this patch the cores clock independently, power consumption is reduced (especially with low to medium load) and peak performance improved. During short bursts e.g. loading youtube.com in chromium all cores jump to the full boost clock (in my case 3.5GHz) or 100MHz below that for long enough to load bloated websites before falling back to a modest 0,9GHz to 1,1GHz. This improves responsiveness a lot over what is achievable with powerd++ or powerd. With powerd++ I have to limit max CPU frequency to 200MHz below the 2,8GHz base clock to avoid triggering ACPI thermal shutdown warnings under sustained load. With the hwpstate patch applied it takes mprime with 4 workers to trigger the same warning (make -sj4 buildworld isn't enough) , but the system remains stable. My system may be a special problem case because Lenovo saw fit to reduce the max CPU temperature to just 75C for the T470s. Comment Actions I just cherry picked this into my tree to try on some servers. As a *USER*, I have the following comments:
machdep.intel_speed_shift.default_epp=X Comment Actions Oh, and in the documentation, it would be handy to note that BIOSes will sometimes disable this functionality and you may need to enable or disable certain common settings in the BIOS. Comment Actions Works fine on the Google Pixelbook (i7-7Y75) (where the old EST does not work because there aren't any acpi_perf devices).
Comment Actions
Not done:
Comment Actions Scott, thanks for picking this up! I'm really glad to see this being driven towards commit. I also really appreciate the cf_dev -> dev revert; keeping name changes divorced from functional changes reduces the size of the patch and increases clarity of what has changed. Some line comments below, but here are the highlights summarized:
Thanks again. This will be great to get into the tree.
Comment Actions
Comment Actions Likewise, thanks for the speedy reviews all.
Comment Actions Thanks, this is kinda weird but I like it better than before. One thing that doesn't seem to have been done yet but needs to be for this to work is a module dependency on intel_hwpstate by est (if we are retaining separate .ko klds). Otherwise the dynamic linker won't look for the now-public identify routine in the other driver (it only searches the explicit dependencies for external symbols). It would also just be fine to compile both into a composite kld, I think. Either way, there may be some missing build glue here.
Comment Actions I would leave the identify routine uncommented in the hwpstate DEVMETHODs. It doesn't hurt to run it twice.
Comment Actions The new patch doesn't have context, so phabricator doesn't know how to show inter-diff and us casual readers cannot expand adjacent code in the web interface. Would you mind uploading a diff with full context (i.e., diff -U99999 ... or using the arc utility)? Thanks!
Comment Actions Looks good to me, thanks!
Comment Actions Actually, it gets broken by suspend/resuming once. % doas sysctl dev.hwpstate_intel.{0..3}.epp=50 dev.hwpstate_intel.0.epp: 89 -> 49 […] After a suspend/resume cycle, it's no longer adjustable: % doas sysctl dev.hwpstate_intel.{0..3}.epp=90 dev.hwpstate_intel.0.epp: 50 -> 50 Comment Actions I've tried the patch on a recent current and added some printf-debug. The output of rdmsr_safe(MSR_IA32_HWP_REQUEST, &requested); changes after the suspend: after zzz: |