
powerd: Changes to adapt this for high core count servers
Needs Review, Public

Authored by gallatin on Jun 13 2022, 3:42 PM.

Details

Summary
  • Add an option (-t) to disable the use of special "turbo" frequencies that overclock some CPUs
  • Add an option (-S) to scale the load to 100%

When not scaled, the load factor calculated by powerd
ranges from 0 to 100*num_cpus. The default algorithms, which
make decisions based on load in the 0..100 range, therefore
fail to work properly: a single fully loaded core looks like
100% CPU to powerd, when it is only about 1.6% of the capacity
of a 64-core system. Dividing the load factor by the number of
CPUs allows the standard algorithms to work sanely on high core
count servers.
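
The effect of the scaling can be illustrated with a minimal sketch. This is not the code from the patch: the helper name `total_load`, its parameters, and the assumption that each CPU reports a busy percentage in 0..100 are all hypothetical and only serve to show the arithmetic.

```c
#include <assert.h>

/*
 * Hypothetical helper (not taken from powerd.c): sum per-CPU busy
 * percentages and, if requested, normalize the result to 0..100.
 */
static int
total_load(const int *cpu_busy_pct, int ncpus, int scale)
{
	int load = 0;

	assert(ncpus > 0);
	for (int i = 0; i < ncpus; i++)
		load += cpu_busy_pct[i];	/* each entry is 0..100 */
	if (scale)
		load /= ncpus;		/* integer division, rounds down */
	return (load);
}
```

With one of 64 CPUs fully busy and the rest idle, the unscaled result is 100, while the scaled result is 1 (100 / 64, rounded down).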

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

I'm assuming you're adding this because it's something useful for Netflix servers? Looks fine to me but I think a note of warning about serial workloads is in order since many desktop systems have e.g. 8 threads but often only one busy process.

usr.sbin/powerd/powerd.8, line 138

I would add "Note that with this flag enabled, powerd will keep the clock speed low on systems with serial workloads." or something to that effect.

Updated man page text, as suggested by Colin Percival

> I'm assuming you're adding this because it's something useful for Netflix servers? Looks fine to me but I think a note of warning about serial workloads is in order since many desktop systems have e.g. 8 threads but often only one busy process.

Yes, it helps our power usage quite a bit. We might be a unique use case in that our workload is highly parallel. I've updated the man page with your suggested text!

Are you sure that "turbo" detection via checking for 1 in the last decimal digit is right? It's been a while since I read it and I may be wrong, but I remember it being specified as 1 MHz higher than the nominal frequency, which in theory may not be a multiple of 10. I wonder whether comparing the frequencies of the first two P-states would be a better solution.

Over the years people have proposed similar changes to frequency scaling many times, and I have always discouraged them. I have no strong objections if it fits your workload and is properly discouraged in the man page, but it would be good to invent something more universal. For example, we could try to naively detect single-threaded workloads by looking not only at total system load, but also at the load of the 2-3 busiest CPUs, etc. It may not work in the case of active forking and context switching, but it may at least detect long-running single-threaded workloads.
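
One possible reading of this suggestion, sketched under stated assumptions: take the larger of the scaled total load and the mean load of the few busiest CPUs. The function name `effective_load`, the `TOPK_MAX` limit, and the choice of "maximum of the two values" are all hypothetical; the comment above does not specify how the two signals would be combined.

```c
#include <assert.h>

#define TOPK_MAX 3	/* assumed upper bound on "busiest CPUs" considered */

/*
 * Hypothetical heuristic: return the larger of the load averaged over
 * all CPUs and the load averaged over the topk busiest CPUs.
 * Each cpu_busy_pct[] entry is assumed to be in 0..100.
 */
static int
effective_load(const int *cpu_busy_pct, int ncpus, int topk)
{
	int top[TOPK_MAX] = { 0 };	/* largest values seen, descending */
	int total = 0, topsum = 0;
	int k;

	assert(ncpus > 0);
	assert(topk >= 1 && topk <= TOPK_MAX);
	k = topk < ncpus ? topk : ncpus;

	for (int i = 0; i < ncpus; i++) {
		int v = cpu_busy_pct[i];

		total += v;
		/* Insert v into top[], pushing smaller values down. */
		for (int j = 0; j < k; j++) {
			if (v > top[j]) {
				int tmp = top[j];

				top[j] = v;
				v = tmp;
			}
		}
	}
	for (int j = 0; j < k; j++)
		topsum += top[j];

	int avg_all = total / ncpus;
	int avg_top = topsum / k;

	return (avg_top > avg_all ? avg_top : avg_all);
}
```

With a single fully busy CPU out of 64, this sketch reports 100 for `topk` = 1 and 50 for `topk` = 2, so the choice of `topk` and of the thresholds applied to the result would need tuning. As noted above, a thread that migrates between CPUs or forks frequently may still go undetected.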

In D35467#804583, @mav wrote:

> Are you sure that "turbo" detection via checking for 1 in the last decimal digit is right? It's been a while since I read it and I may be wrong, but I remember it being specified as 1 MHz higher than the nominal frequency, which in theory may not be a multiple of 10. I wonder whether comparing the frequencies of the first two P-states would be a better solution.

This is a very good idea. The last-digit check works on the only CPUs I have to test on, but checking for +1 MHz over the nominal frequency sounds more resilient. Thank you!

> Over the years people have proposed similar changes to frequency scaling many times, and I have always discouraged them. I have no strong objections if it fits your workload and is properly discouraged in the man page, but it would be good to invent something more universal. For example, we could try to naively detect single-threaded workloads by looking not only at total system load, but also at the load of the 2-3 busiest CPUs, etc. It may not work in the case of active forking and context switching, but it may at least detect long-running single-threaded workloads.

Thank you. At least for now, this minimal patch is enough to solve the problem at hand for our workload.

Updated the check for turbo frequencies, as suggested by mav.

usr.sbin/powerd/powerd.c, line 220

I think you are over-engineering here. The frequencies should be sorted, and turbo should be the first. You could just check that the first is 1 MHz higher than the second and call it done.

Thanks, I know that's how it is on my system, but I wasn't sure it was sorted in the same direction, etc., on all systems. These constraints make it far simpler.
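
For reference, the two checks discussed in this thread can be sketched as follows. This is an illustration, not the code under review: the function names are hypothetical, and the frequency list is assumed to be in MHz and sorted from highest to lowest, as stated in the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The earlier heuristic: a frequency whose last decimal digit is 1. */
static bool
turbo_by_last_digit(int freq_mhz)
{
	return (freq_mhz % 10 == 1);
}

/*
 * The simplified check: with the list sorted highest first, the first
 * entry is treated as turbo if it is exactly 1 MHz above the second.
 */
static bool
has_turbo_first(const int *freqs_mhz, size_t nfreqs)
{
	assert(freqs_mhz != NULL);
	return (nfreqs >= 2 && freqs_mhz[0] == freqs_mhz[1] + 1);
}

/* Index of the highest frequency to use, skipping turbo if disallowed. */
static size_t
first_usable_index(const int *freqs_mhz, size_t nfreqs, bool allow_turbo)
{
	if (!allow_turbo && has_turbo_first(freqs_mhz, nfreqs))
		return (1);
	return (0);
}
```

For a list such as 2401, 2400, 2200 both checks agree. For a hypothetical nominal frequency that is not a multiple of 10, such as 2134, 2133, 2000, only the comparison of the first two entries identifies the turbo entry.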