
x86: Fix scheduler topology assumptions about uniformity
Abandoned, Public

Authored by cem on Feb 13 2020, 2:48 AM.

Details

Reviewers
jeff
rlibby
kib
Summary

This could be triggered by taking a 2-socket system and disabling all cores
in socket 0, except the BSP, resulting in a non-uniform cache hierarchy.

There is no reason we cannot make reasonable scheduling decisions based on
the cache-locality information we already have in such configurations, and
we shouldn't panic on boot.

Submitted by: Mike Evans (earlier version)

Test Plan

Without this change, reaching sched_setup_smp() in sched_ule on a non-uniform
x86 topology (for example, socket 1 plus only the BSP enabled in socket 0)
made smp_topo_find() return NULL, which triggered
panic("Can't find cpu group for...").
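
To illustrate the failure mode described above, here is a minimal userland sketch of a strict top-down leaf search. It is not the kernel's smp_topo_find(); struct topo_group, topo_find_leaf() and topo_demo() are hypothetical names, the 64-bit mask stands in for a cpuset, and the 5-CPU tree (CPU 0 as the lone BSP, CPUs 1-4 on the other socket) is an assumed stand-in for the real machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scheduler topology node. */
struct topo_group {
	uint64_t		 mask;		/* bit i set: CPU i is in this group */
	int			 nchildren;	/* 0 means this node is a leaf */
	struct topo_group	*child;		/* array of nchildren subgroups */
};

/*
 * Strict descent: return the leaf that contains cpu, or NULL when some
 * level of the tree contains the CPU but none of its children do.
 */
static struct topo_group *
topo_find_leaf(struct topo_group *top, int cpu)
{
	struct topo_group *cg = top;
	uint64_t bit;
	int i;

	assert(cpu >= 0 && cpu < 64);
	bit = UINT64_C(1) << cpu;
	for (;;) {
		if ((cg->mask & bit) == 0)
			return (NULL);
		if (cg->nchildren == 0)
			return (cg);
		for (i = 0; i < cg->nchildren; i++)
			if ((cg->child[i].mask & bit) != 0)
				break;
		if (i == cg->nchildren)
			return (NULL);	/* in the parent, but in no child */
		cg = &cg->child[i];
	}
}

/*
 * Assumed non-uniform tree: the root covers CPUs 0-4, but the only
 * child group covers CPUs 1-4. Returns 1 if the lookup for the lone
 * CPU 0 fails while the lookup for CPU 1 succeeds.
 */
static int
topo_demo(void)
{
	static struct topo_group other_socket = { 0x1e, 0, NULL };
	static struct topo_group root = { 0x1f, 1, &other_socket };

	return (topo_find_leaf(&root, 0) == NULL &&
	    topo_find_leaf(&root, 1) == &other_socket);
}
```

In this model the search for CPU 0 has nowhere to descend, which corresponds to the NULL result that the scheduler setup code turns into a panic.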

Diff Detail

Lint: Lint Passed
Unit: No Test Coverage
Build Status: Buildable 29338 (Build 27239: arc lint + arc unit)

Event Timeline

So this looks to me like we made a bad topology and then worked around it by weakening the search. This should still make the leaf topology node for the lone cpu. Otherwise ULE will make bad decisions.

Can you show me the boot verbose or sysctl kern.sched.topology_spec from this configuration?

In D23659#520598, @jeff wrote:

> So this looks to me like we made a bad topology and then worked around it by weakening the search.

Yeah, it seems so.

> This should still make the leaf topology node for the lone cpu. Otherwise ULE will make bad decisions.
>
> Can you show me the boot verbose or sysctl kern.sched.topology_spec from this configuration?

I don't have the former anymore. The latter I guess we could get with this change applied, since it avoids the panic. I'd have to ask MikeE; I don't have direct access to this system.

If it helps, the machine is a quad-socket with 24 cores (maybe threads) per socket; however, all CPUs on sockets 0, 2, and 3 have been disabled, except for the BSP thread on socket 0. So it's a 25-core non-uniform configuration; the idea is to measure NUMA latency from socket 1 to devices on socket 0. When the BSP's sibling hyperthread is left enabled, the reported panic goes away.

Ok, yes, the issue is that we never created a leaf for this lone CPU.

The problem with this is that ULE will assume all of the other CPUs in the system are in the leaf with the BSP. I haven't reviewed the amd64 code in some time, but it probably assumes that a CPU with no neighbors is already in a set above with peers. When you add HTT, it has one peer, and they share something different from the existing set. It should be fixed there.
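
One way to picture the invariant being asked for here (every CPU in the topology ends up in some leaf) is a small coverage check. This is only a sketch under assumptions: struct topo_group, topo_leaf_cover() and topo_uncovered() are hypothetical names, not amd64 or ULE code, and a 64-bit mask stands in for a cpuset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scheduler topology node. */
struct topo_group {
	uint64_t		 mask;		/* bit i set: CPU i is in this group */
	int			 nchildren;	/* 0 means this node is a leaf */
	struct topo_group	*child;		/* array of nchildren subgroups */
};

/* Union of the masks of all leaves at or below cg. */
static uint64_t
topo_leaf_cover(const struct topo_group *cg)
{
	uint64_t cover = 0;
	int i;

	if (cg->nchildren == 0)
		return (cg->mask);
	for (i = 0; i < cg->nchildren; i++)
		cover |= topo_leaf_cover(&cg->child[i]);
	return (cover);
}

/*
 * CPUs that are in the root mask but in no leaf. A result of 0 means
 * every CPU has a leaf; any set bit marks a CPU that a strict leaf
 * search could not find.
 */
static uint64_t
topo_uncovered(const struct topo_group *top)
{
	return (top->mask & ~topo_leaf_cover(top));
}
```

For an assumed tree whose root covers CPUs 0-4 (mask 0x1f) with a single leaf for CPUs 1-4 (mask 0x1e), topo_uncovered() reports CPU 0 as missing; giving CPU 0 its own leaf (mask 0x01) alongside the other one makes the result 0.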