
x86: Fix scheduler topology assumptions about uniformity
Needs Review, Public

Authored by cem on Thu, Feb 13, 2:48 AM.

Details

Reviewers
jeff
rlibby
kib
Summary

A boot-time panic in the scheduler's topology setup could be triggered by
taking a 2-socket system and disabling all cores in socket 0 except the BSP,
resulting in a non-uniform cache hierarchy.

There is no reason we cannot make reasonable scheduling decisions based on
the cache-locality information we already have in such configurations, and
we shouldn't panic on boot.

Submitted by: Mike Evans (earlier version)

Test Plan

Without this change, sched_setup_smp() in sched_ule, when run on a
non-uniform x86 topology (such as socket 1 plus only the BSP enabled in
socket 0), got NULL back from smp_topo_find() and hit
panic("Can't find cpu group for...").
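
The failure mode above can be modeled in userland. The following is a minimal sketch, not the kernel's smp_topo_find(): the struct, function, and variable names are hypothetical, and it assumes (as suggested later in this review) that the lone BSP appears in the root group's mask but has no leaf group of its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a scheduler topology node. */
struct topo_group {
	uint64_t mask;			/* bit n set: CPU n is in this group */
	int nchildren;			/* 0 for a leaf */
	struct topo_group *children;	/* array of nchildren nodes */
};

/* Descend to the leaf that contains cpu; NULL if no leaf covers it. */
static struct topo_group *
topo_find_leaf(struct topo_group *top, int cpu)
{
	uint64_t bit = (uint64_t)1 << cpu;
	struct topo_group *cg = top;
	struct topo_group *next;
	int i;

	if ((cg->mask & bit) == 0)
		return (NULL);
	while (cg->nchildren != 0) {
		next = NULL;
		for (i = 0; i < cg->nchildren; i++) {
			if ((cg->children[i].mask & bit) != 0) {
				next = &cg->children[i];
				break;
			}
		}
		if (next == NULL)
			return (NULL);	/* in the parent, but in no child */
		cg = next;
	}
	return (cg);
}

/* Assumed bad topology: CPU 0 (BSP) is in the root mask, no leaf holds it. */
static struct topo_group bad_children[1] = { { 0x6, 0, NULL } };
static struct topo_group bad_root = { 0x7, 1, bad_children };

/* Same CPUs, but with a dedicated single-CPU leaf for CPU 0. */
static struct topo_group good_children[2] = {
	{ 0x1, 0, NULL }, { 0x6, 0, NULL }
};
static struct topo_group good_root = { 0x7, 2, good_children };
```

In this model, looking up CPU 0 in bad_root yields NULL, which corresponds to the condition that led to the panic, while the same lookup in good_root returns the single-CPU leaf.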

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 29338
Build 27239: arc lint + arc unit

Event Timeline

cem created this revision. Thu, Feb 13, 2:48 AM
jeff added a comment. Sun, Feb 16, 9:59 AM

So this looks to me like we made a bad topology and then worked around it by weakening the search. This should still make the leaf topology node for the lone cpu. Otherwise ULE will make bad decisions.

Can you show me the boot verbose or sysctl kern.sched.topology_spec from this configuration?

cem added a comment (edited). Sun, Feb 16, 4:31 PM
In D23659#520598, @jeff wrote:

> So this looks to me like we made a bad topology and then worked around it by weakening the search.

Yeah, it seems so.

> This should still make the leaf topology node for the lone cpu. Otherwise ULE will make bad decisions.
> Can you show me the boot verbose or sysctl kern.sched.topology_spec from this configuration?

I don't have the former anymore; the latter I guess we could get with this change to avoid panic. I'd have to ask MikeE, I don't have direct access to this system.

If it helps, the machine is a quad-socket with 24 cores (maybe threads) per socket; however, all CPUs on sockets 0, 2, and 3 have been disabled, except for the BSP thread on socket 0. So it's a 25-core non-uniform configuration; the idea is to measure NUMA latency from socket 1 to devices on socket 0. When the BSP's sibling hyperthread is left enabled, the reported panic goes away.

jeff added a comment. Sun, Feb 16, 9:00 PM

Ok, yes, the issue is we never created a leaf for this lone cpu.

The problem with this is that ULE will assume all of the other CPUs in the system are in the leaf with the BSP. I haven't reviewed the amd64 code in some time but it probably assumes that a cpu with no neighbors is already in a set above with peers. When you add HTT it has one peer and they share something different than the existing set. It should be fixed there.
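
To illustrate the concern above, here is a minimal userland sketch. It assumes the relaxed search falls back to the deepest group that still contains the CPU; the actual change under review is not reproduced here, and all names are hypothetical. It models the 25-CPU case from this thread (BSP as CPU 0, socket 1 as CPUs 1 to 24) with and without a single-CPU leaf for the BSP.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical topology node; bit n of mask set means CPU n is a member. */
struct topo_node {
	uint64_t mask;
	int nchildren;			/* 0 for a leaf */
	struct topo_node *children;	/* array of nchildren nodes */
};

/* Relaxed lookup (assumed behavior): deepest group that contains cpu. */
static struct topo_node *
topo_find_deepest(struct topo_node *top, int cpu)
{
	uint64_t bit = (uint64_t)1 << cpu;
	struct topo_node *cg = top;
	int i;

	if ((cg->mask & bit) == 0)
		return (NULL);
	for (;;) {
		for (i = 0; i < cg->nchildren; i++)
			if ((cg->children[i].mask & bit) != 0)
				break;
		if (i == cg->nchildren)
			return (cg);	/* no child holds cpu: settle here */
		cg = &cg->children[i];
	}
}

/* Number of CPUs in a group. */
static int
topo_ncpus(const struct topo_node *n)
{
	uint64_t v = n->mask;
	int count = 0;

	while (v != 0) {
		v &= v - 1;	/* clear the lowest set bit */
		count++;
	}
	return (count);
}

#define	ALL_MASK	0x1FFFFFFULL	/* CPUs 0..24 */
#define	SOCKET1_MASK	0x1FFFFFEULL	/* CPUs 1..24 */

/* No leaf for the BSP: the relaxed lookup lands on the root. */
static struct topo_node weak_children[1] = { { SOCKET1_MASK, 0, NULL } };
static struct topo_node weak_root = { ALL_MASK, 1, weak_children };

/* Dedicated leaf for the BSP: the lookup lands on a single-CPU group. */
static struct topo_node leaf_children[2] = {
	{ 0x1ULL, 0, NULL }, { SOCKET1_MASK, 0, NULL }
};
static struct topo_node leaf_root = { ALL_MASK, 2, leaf_children };
```

Under these assumptions, the group found for CPU 0 in weak_root spans all 25 CPUs, so a scheduler treating that group as the BSP's leaf would consider every other CPU a cache-sharing neighbor. With the dedicated leaf in leaf_root, the same lookup returns a group containing only CPU 0.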