Page MenuHomeFreeBSD

hwpstate_amd(4): Factor out setting the CPPC_REQUEST register
ClosedPublic

Authored by olce on Sat, Jan 31, 11:49 AM.
Tags
None
Referenced Files
F145006586: D55008.diff
Sat, Feb 14, 11:39 PM
F145001823: D55008.diff
Sat, Feb 14, 10:29 PM
Unknown Object (File)
Sat, Feb 14, 8:20 AM
Unknown Object (File)
Sat, Feb 14, 6:08 AM
Unknown Object (File)
Wed, Feb 4, 1:25 AM
Unknown Object (File)
Tue, Feb 3, 10:55 PM
Unknown Object (File)
Sat, Jan 31, 8:54 PM
Unknown Object (File)
Sat, Jan 31, 7:39 PM
Subscribers

Details

Summary

In preparation for creating other knobs to tweak values in this register
beyond just the EPP (Efficiency/Performance Preference).

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

olce requested review of this revision.Sat, Jan 31, 11:49 AM

LGTM with a comment.

sys/x86/cpufreq/hwpstate_amd.c
407

Just some preference. Can we move this line into the caller's for loop and inline this function
it is somehow faster since request structure is allocated on the stack of set_cppc_request.

This revision is now accepted and ready to land.Sat, Feb 7, 2:45 AM
sys/x86/cpufreq/hwpstate_amd.c
407

Not sure I understand what you mean here, so let me rephrase. You'd like setting the sc field in the caller set_cppc_request() instead of here, is that it? And inlining set_cppc_request_send_one()?

If so, then there are two points of view:

  1. The textual code one: There's a kind of balance here. I usually prefer that common code is factored out. Setting sc depends on the device (so, the CPU) to which the request is sent, so it makes sense to handle that in the sub-function rather than in the caller. IMHO, this also makes the intent of the code of the caller slightly clearer. At the same time, we are preparing part of the request structure in the caller, so that structure is only partially filled before calling the sub-function. Given the sub-function is really "internal" to the caller and small, I don't think that's a problem per se, but I can add some comment in the caller to make the intent explicit.
  2. The performance one: I do not think this kind of performance difference is relevant in practice, both because it's considerably smaller than the machinery of sending an IPI to another CPU, and also because this is not performance critical at all (extremely infrequent uses). As said in an earlier revision, clang inlines a lot without directives, especially small functions. That said, I've checked again and it does something slightly different here: It inlines setc_cppc_request() into sysctl_cppc_request_field_handler() (the successor to sysctl_epp_handler(), see next revisions) but keeps actual calls to set_cppc_request_send_one(). Adding an explicit inline annotation makes it inline that sub-function too. So going to add an inline on commit, is that good enough for you?

If we really want to optimize something here, then we should be looking at sending a single IPI to multiple CPUs. I have already considered doing that, but did not as other things looked more pressing. That would be more efficient, but I'm not sure that would make any practical difference even for hundreds of cores. If we do that, then we could remove set_cppc_request_send_one() as a separate function (as only one use would remain).

olce marked an inline comment as done.Sat, Feb 7, 9:36 AM
sys/x86/cpufreq/hwpstate_amd.c
407

Performace is not a big deal in here. I just want the caller fill all member of the request data. But I am ok without changing this.

olce marked an inline comment as done.Mon, Feb 9, 8:48 AM
olce added inline comments.
sys/x86/cpufreq/hwpstate_amd.c
407

Adding a comment at the point of initialization of data about sc being filled by set_cppc_request_send_one().

olce marked an inline comment as done.Mon, Feb 9, 11:09 AM