hmp(4): introduce Heterogeneous MultiProcessing support
Needs ReviewPublic

Authored by minsoochoo0122_proton.me on Tue, Apr 21, 10:21 AM.
Tags
None

Details

Reviewers
imp
olce
adrian
Summary

Add initial support for HMP.

The hmp(4) framework sits between the scheduler and providers. It aims
to forward information from providers to the scheduler to enable hybrid
scheduling.

Support for controlling hardware from the scheduler (e.g. Arm SCMI) will
be added later.

For now, this option is disabled by default; it will be enabled once it
becomes stable enough.

Sponsored by: FreeBSD Foundation

Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 73140
Build 70023: arc lint + arc unit

Event Timeline

Changes from D56547:

  • Moved from struct pcpu to DPCPU so we can avoid options HMP on pcpu.
  • Reworked the score system. Since the scheduler only cares about perf and eff scores, we don't need other capabilities.
  • Capacities don't need atomic operations since they are written once at boot.
  • Added sysctls.
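
The write-once versus runtime-updated distinction in the list above can be modelled in userspace as follows. This is only a sketch under stated assumptions, not code from the patch: `struct model_cpu`, `model_boot_set_capacity()`, `model_provider_update()` and the `model_read_*()` helpers are hypothetical names, a plain array stands in for DPCPU storage, and using C11 relaxed atomics for the scores is an assumption made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define MODEL_NCPU 4	/* hypothetical CPU count for this model */

/* Hypothetical per-CPU record; a plain array stands in for DPCPU. */
struct model_cpu {
	uint32_t capacity;	/* written once at boot, read-only afterwards */
	_Atomic uint32_t perf;	/* may be rewritten by a provider at runtime */
	_Atomic uint32_t eff;	/* may be rewritten by a provider at runtime */
};

static struct model_cpu model_cpus[MODEL_NCPU];

/* Boot-time path: a plain store is enough because no reader runs yet. */
static void
model_boot_set_capacity(int cpu, uint32_t capacity)
{
	assert(cpu >= 0 && cpu < MODEL_NCPU);
	model_cpus[cpu].capacity = capacity;
}

/* Runtime path: scores can change while readers are active. */
static void
model_provider_update(int cpu, uint32_t perf, uint32_t eff)
{
	assert(cpu >= 0 && cpu < MODEL_NCPU);
	atomic_store_explicit(&model_cpus[cpu].perf, perf,
	    memory_order_relaxed);
	atomic_store_explicit(&model_cpus[cpu].eff, eff,
	    memory_order_relaxed);
}

static uint32_t
model_read_perf(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NCPU);
	return (atomic_load_explicit(&model_cpus[cpu].perf,
	    memory_order_relaxed));
}

static uint32_t
model_read_eff(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NCPU);
	return (atomic_load_explicit(&model_cpus[cpu].eff,
	    memory_order_relaxed));
}
```

In this model only the score fields pay for atomic access; the capacity field is an ordinary integer because it never changes after initialization.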

@koinec_yahoo.co.jp Could you please work on intelhfi based on this patch series?

Thank you very much for improving the HMP code and creating the manual.

Understood. I will proceed with supporting this version.

  • Remove dynamic and throttle flags as they are not needed anymore
  • Add hmp_lowest_capacity_cpu()
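
For readers following along, a helper such as hmp_lowest_capacity_cpu() could plausibly behave like the userspace sketch below. The actual signature and tie-breaking rule in the patch are not shown in this thread, so `model_lowest_capacity_cpu()`, its parameters, and the "lowest CPU ID wins ties" rule are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: return the index of the CPU with the smallest
 * static capacity among the first ncpu entries.  Ties go to the lowest
 * CPU ID (an assumption, not taken from the patch).
 */
static int
model_lowest_capacity_cpu(const uint32_t *capacity, int ncpu)
{
	int best = 0;

	assert(capacity != NULL && ncpu > 0);
	for (int cpu = 1; cpu < ncpu; cpu++) {
		if (capacity[cpu] < capacity[best])
			best = cpu;
	}
	return (best);
}
```

Because capacities are static, a real implementation could also compute this once at boot and cache the result instead of scanning on every call.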

Thank you very much for the improvements.
Is it possible to retain the flag field and the definition of the "Throttled" flag?

If the intention behind removing it is that it might not be immediately used in the ULE scheduler, I understand the reasoning. I am currently learning about the ULE scheduler while testing intelhfi. Since ULE focuses on distributing loads to the lowest-load CPUs while remaining cache-aware, I agree that the Throttled flag isn't strictly necessary for ULE itself.

However, I view HMP(4) as a forward-looking mechanism designed to share scores across heterogeneous multicore architectures, rather than something optimized solely for the ULE scheduler. Also, considering that the Intel Hardware Feedback Interface defines a "Throttled" state, I assume this flag was initially introduced with Arm architecture development in mind.
Therefore, notification of the "Throttled" state could become essential in the future for managing heterogeneous multicore environments. For this reason, I would like to request keeping the Throttled state configurable rather than removing it.

On the other hand, since "Capacity" represents a static capability, dynamic updates would be handled by a Provider that sets the "Score." Thus, I agree that the "Dynamic" flag can be safely removed.

Additionally, the Intel Hardware Feedback Interface defines a state for notifying power-efficient cores (such as LP-E cores) to aggregate all tasks. Specifically, when the HFI table's Efficiency score reaches its maximum value of 255, it requests aggregating all tasks to that specific core to maximize battery efficiency. This is conceptually the opposite of the "Throttled" state.
To support this behavior, would it be possible to add a "Consolidated" flag to the flag field?

> • Remove dynamic and throttle flags as they are not needed anymore
>
> Thank you very much for the improvements.
> Is it possible to retain the flag field and the definition of the "Throttled" flag?
>
> If the intention behind removing it is that it might not be immediately used in the ULE scheduler, I understand the reasoning. I am currently learning about the ULE scheduler while testing intelhfi. Since ULE focuses on distributing loads to the lowest-load CPUs while remaining cache-aware, I agree that the Throttled flag isn't strictly necessary for ULE itself.
>
> However, I view HMP(4) as a forward-looking mechanism designed to share scores across heterogeneous multicore architectures, rather than something optimized solely for the ULE scheduler. Also, considering that the Intel Hardware Feedback Interface defines a "Throttled" state, I assume this flag was initially introduced with Arm architecture development in mind.
> Therefore, notification of the "Throttled" state could become essential in the future for managing heterogeneous multicore environments. For this reason, I would like to request keeping the Throttled state configurable rather than removing it.

Currently, the sole goal of hmp(4) is integration with the ULE scheduler (or any future scheduler). When I added the throttled flag, I thought it would be useful for thread placement, so that the scheduler could avoid placing threads on throttled cores. But this comes with two drawbacks:

  1. Assume we have a two-core arm64 board and run buildworld (or any other heavy workload) on it. Both cores are then likely to be flagged as throttled. If the scheduler refuses to assign threads to either core, nothing can run, which is not an option. Otherwise, the scheduler concludes that all cores are throttled and places the thread on a random core, but then we have wasted O(n) time reaching that conclusion. (O(n) on a 2-core board is not a big issue, but imagine this happening on a 256-core server.)
  2. Throttling should be part of a thermal subsystem like cpufreq. For Arm specifically, I believe writing a cpufreq_scmi driver would be a better way to handle the throttled flag. You might ask: what if we control the cores through hmp(4)? But then we would have two subsystems (cpufreq and hmp) doing the same work, which introduces another layer of uncertainty.
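
The cost described in the first drawback can be illustrated with a small userspace model. This is a sketch under assumptions, not scheduler code: `model_pick_cpu()` and its parameters are hypothetical, and a linear first-fit scan stands in for whatever placement logic a scheduler would really use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical throttle-aware placement: return the first CPU that is
 * not throttled, or `fallback` when every CPU is throttled.  *visited
 * reports how many entries were examined before a decision was made.
 */
static int
model_pick_cpu(const bool *throttled, int ncpu, int fallback, int *visited)
{
	assert(throttled != NULL && visited != NULL && ncpu > 0);
	*visited = 0;
	for (int cpu = 0; cpu < ncpu; cpu++) {
		(*visited)++;
		if (!throttled[cpu])
			return (cpu);
	}
	/* Every CPU was throttled: the whole scan bought us nothing. */
	return (fallback);
}
```

When all 256 entries are throttled, the model examines all 256 of them and still returns the fallback CPU, which is the wasted O(n) work the comment refers to.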

> Additionally, the Intel Hardware Feedback Interface defines a state for notifying power-efficient cores (such as LP-E cores) to aggregate all tasks. Specifically, when the HFI table's Efficiency score reaches its maximum value of 255, it requests aggregating all tasks to that specific core to maximize battery efficiency. This is conceptually the opposite of the "Throttled" state.
> To support this behavior, would it be possible to add a "Consolidated" flag to the flag field?

The hmp(4) subsystem works like a greatest common divisor. In other words, it only abstracts what seems to be common across many providers. But this seems to be intelhfi-specific.

Moving a bulk of tasks at once is generally avoided because migrating threads without a good reason is costly (e.g. cache misses, TLB misses). Even without a bulk migration, if a core's efficiency score is high enough, tasks should be balanced gradually over time according to the algorithm, so I don't see why the aggregation needs to happen all at once. But as you said, if it's good for battery efficiency, it's worth discussing with people interested in schedulers and laptop projects.