new x86 smp topology detection code
Closed, Public

Authored by avg on Jun 3 2015, 5:36 PM.

Details

Summary
  • based on APIC ID derivation rules for Intel and AMD CPUs
  • can handle non-uniform topologies
  • requires homogeneous APIC ID assignment (same bit widths for ID components)
  • doesn't yet handle dual-node AMD CPUs
  • supports only package/core/cache nodes
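
As a rough illustration of what "APIC ID derivation rules" and "same bit widths for ID components" mean in practice, the sketch below splits an APIC ID into package, core and thread fields. It is not taken from the patch: the struct names, `apic_id_split()` and the two-field layout are hypothetical, and the bit widths are assumed to have been obtained elsewhere (for example from CPUID) and to be identical on every CPU.

```c
#include <stdint.h>

/*
 * Hypothetical layout descriptor: widths of the low-order APIC ID fields.
 * Assumed to be the same for every CPU in the system (homogeneous
 * assignment). Each width is assumed to be well below 32.
 */
struct apic_id_layout {
	unsigned thread_bits;	/* width of the SMT (thread) field */
	unsigned core_bits;	/* width of the core field */
};

/* Hypothetical result: the three components of one APIC ID. */
struct apic_id_parts {
	uint32_t pkg;
	uint32_t core;
	uint32_t thread;
};

static struct apic_id_parts
apic_id_split(uint32_t apic_id, struct apic_id_layout l)
{
	struct apic_id_parts p;

	/* Lowest bits: thread within a core. */
	p.thread = apic_id & ((1u << l.thread_bits) - 1);
	/* Next bits: core within a package. */
	p.core = (apic_id >> l.thread_bits) & ((1u << l.core_bits) - 1);
	/* Everything above: package. */
	p.pkg = apic_id >> (l.thread_bits + l.core_bits);
	return (p);
}
```

With one thread bit and three core bits, APIC ID 0x1b decomposes into package 1, core 5, thread 1. Because the fields are read from the ID rather than counted, gaps in the ID space (for example a package with fewer cores enabled) do not shift the fields of other CPUs.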

Todo:

  • AMD dual-node processors
  • NUMA nodes
  • AMD Bulldozer module nodes (?)
  • checking for homogeneity of the APIC ID assignment across packages
  • more flexible cache placement within topology

Long term todo:

  • sharing of other resources like FPU

The new code adds all CPU caches to the scheduling topology.
Previously we excluded caches that contained only a single CPU
and caches that covered all CPUs in the system (e.g. the top-level
cache of a single-package system).
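
The previous exclusion rule can be written as a small predicate. This is only a sketch of the rule as described above, not code from the tree; `old_rule_keeps_cache()` and its parameters are hypothetical names.

```c
#include <stdbool.h>

/*
 * Hypothetical predicate for the old behaviour: a cache level was kept
 * in the scheduling topology only if it was shared by more than one CPU
 * and by fewer than all CPUs in the system.
 */
static bool
old_rule_keeps_cache(unsigned cpus_sharing, unsigned cpus_total)
{
	return (cpus_sharing > 1 && cpus_sharing < cpus_total);
}
```

Under the new code no such filter applies, so a private L1 and a system-wide last-level cache both show up as topology nodes.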

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage

Event Timeline

I have two requests before the code is reviewed. Both concern the comments, not the code itself.

First, please add at least short explanations for the functions, stating the intent of the operation performed. E.g., does topo_analyze() only extract counters, or does it also perform validation?

Second, it would be very helpful, both for review and for later work with the code, to have references to the exact document revisions and sections for the implemented algorithms. I am aware of the Intel whitepaper on the proper implementation of topology enumeration, but I know of nothing comparable for AMD.

Also, my impression from just skimming the code, possibly unfounded, is that it seriously lacks assertions.

I don't know much about the topology-related parts of cpuid, but if this patch is what I tested a couple of years ago, then I am all for it from a consumer's point of view. :)

sys/kern/subr_smp.c
58

Is the extra endif/ifdef needed here?

sys/x86/x86/mp_x86.c
243

Is it correct to have a loop without a limit here? If we ever increase MAX_CACHE_LEVELS, won't it confuse old CPUs?

sys/kern/subr_smp.c
58

No, it's not really needed.

sys/x86/x86/mp_x86.c
243

According to the AMD CPUID Specification, the last defined value for this CPUID function must always be zero. That translates to a cache type of zero, which add_deterministic_cache() recognizes as a special value, and that ensures termination of this loop. So this loop iterates over all caches reported by a CPU. In other words, I do not see how MAX_CACHE_LEVELS could affect this loop.
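
A minimal sketch of the termination argument, for readers without the patch at hand. It does not execute CPUID and does not reproduce add_deterministic_cache(); `fake_caches`, `fake_query()` and `count_caches()` are hypothetical stand-ins, and the table contents are made-up illustrative data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache descriptor; a type of 0 means "no more caches". */
struct cache_desc {
	uint32_t type;
	uint32_t level;
};

/* Made-up data standing in for what one CPU might report. */
static const struct cache_desc fake_caches[] = {
	{ 1, 1 },	/* L1 data */
	{ 2, 1 },	/* L1 instruction */
	{ 3, 2 },	/* L2 unified */
	{ 3, 3 },	/* L3 unified */
	{ 0, 0 },	/* null descriptor: terminates enumeration */
};

/* Stand-in for querying the cache-properties function with a given index. */
static struct cache_desc
fake_query(unsigned subleaf)
{
	size_t n = sizeof(fake_caches) / sizeof(fake_caches[0]);

	/* Indexes past the table also read as the null descriptor. */
	if (subleaf >= n)
		return (fake_caches[n - 1]);
	return (fake_caches[subleaf]);
}

/*
 * Enumerate caches with no fixed upper bound: the loop stops at the
 * first descriptor whose type is 0, however many levels there are.
 */
static unsigned
count_caches(struct cache_desc (*query)(unsigned))
{
	unsigned i;

	for (i = 0; query(i).type != 0; i++)
		;
	return (i);
}
```

Here `count_caches(fake_query)` returns 4. The bound comes from the data reported by the CPU rather than from a compile-time constant, which is the point being made about MAX_CACHE_LEVELS.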

mav edited edge metadata.

Looks good to me. Tried it on 2x Intel E5-2690v2, AMD Athlon(tm) II X2 240 and AMD A10-7850K. In all cases detected topology looks reasonable.

This revision is now accepted and ready to land. Apr 2 2016, 1:39 PM
This revision was automatically updated to reflect the committed changes.