
Make it possible to disable NUMA support with a tunable.
ClosedPublic

Authored by markj on Oct 5 2018, 7:31 PM.

Details

Summary

This change depends on D17416. The idea is to provide a workaround for
anyone who is negatively impacted by NUMA being configured in GENERIC.
Setting vm.ndomains=1 in the loader overrides the NUMA topology
detection in srat.c.

Disabling NUMA this way does not completely remove the overhead
associated with this option. In particular, every memory allocation
will still go through the vm_domainset iterator code.
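
As a rough illustration of the override described above, here is a minimal userspace sketch. It is not the code under review: `effective_ndomains()`, `detected`, and `tunable` are hypothetical names, and only the `vm.ndomains=1` case from the summary is modeled.

```c
#include <assert.h>

/*
 * Hypothetical model of the override: `detected` stands in for the domain
 * count found by topology detection, and `tunable` for a loader value such
 * as vm.ndomains (0 meaning "not set").
 */
int
effective_ndomains(int detected, int tunable)
{
	/* Detection always reports at least one domain in this model. */
	assert(detected >= 1);

	/* A tunable value of 1 collapses the system to a single domain. */
	if (tunable == 1)
		return (1);

	/* Otherwise keep whatever detection reported. */
	return (detected);
}
```

Even when this returns 1, an allocator that iterates over a domain set would still run its iterator over the single remaining domain, which is the residual overhead the summary mentions.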

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

markj added reviewers: alc, kib, jeff, jhb, gallatin, emaste.
sys/vm/vm_phys.c
609 ↗(On Diff #48802)

It would be slightly friendlier to the user to fall back to either 1 or n in this case.

sys/x86/acpica/srat.c
538 ↗(On Diff #48802)

I would write this in one line using ?: as in acpi_map_pxp_to_vm_domain() below.
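
A small sketch of the style point, under the assumption that the code in question picks between a mapped domain and 0. The function names and the `ndomains > 1` condition are hypothetical and are not taken from srat.c.

```c
/* Hypothetical multi-line form: return the domain only when NUMA is active. */
int
map_domain_if_else(int ndomains, int domain)
{
	if (ndomains > 1)
		return (domain);
	return (0);
}

/* The same logic written on one line with the conditional operator. */
int
map_domain_ternary(int ndomains, int domain)
{
	return (ndomains > 1 ? domain : 0);
}
```

Both forms return the same value for every input; the one-line version simply matches the style the comment refers to.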

markj marked 2 inline comments as done. Oct 5 2018, 7:58 PM

Please break out the man page changes that are not related to the boot-time tunable into a separate commit/differential.

share/man/man4/numa.4
34 ↗(On Diff #48802)

Doesn't MAXMEMDOM still take an integer argument? And I think the default is 8?

68 ↗(On Diff #48802)

To me this is the only change to the man page that belongs with this review; all the other changes have nothing to do with the code change in this review.

This revision is now accepted and ready to land. Oct 5 2018, 8:05 PM
sys/arm64/arm64/mp_machdep.c
579–581 ↗(On Diff #48804)

This looks wrong: when vm_domains is 1, domain will be unset.

We'll also need to read the domain for the dual-socket ThunderX and store it somewhere, even in the single vm_domain case. This is because some cross-socket interrupts don't work.

markj marked 2 inline comments as done. Oct 8 2018, 5:55 PM
markj added inline comments.
share/man/man4/numa.4
34 ↗(On Diff #48802)

It does. The default is platform-dependent.

sys/arm64/arm64/mp_machdep.c
579–581 ↗(On Diff #48804)

Hrmm. I would really like to be able to implement, e.g., pcpu_page_alloc() without an #ifdef NUMA.

Could you point me to the code which depends on pc_domain's value?

My only suggestion might be to consider an explicit "disabled" tunable. For SMP we have 'kern.smp.disabled=1'. Having 'vm.numa.disabled=1' or 'kern.numa.disabled=1' might be more intuitive than setting ndomains to 1. (The implementation might be that you set ndomains to 1 if the tunable is set, but it's the UI I'm thinking of).
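
The parenthetical above (set ndomains to 1 if the tunable is set) could be modeled roughly as follows. This is a userspace sketch with hypothetical helper names. The tunable value is passed in as a string, with NULL meaning "unset"; a kernel implementation would read it through the tunable interface instead.

```c
#include <stdlib.h>

/*
 * Hypothetical parser for a boolean-style tunable such as
 * vm.numa.disabled. Any nonzero integer counts as "disabled".
 */
int
numa_disabled_from_tunable(const char *value)
{
	if (value == NULL)
		return (0);
	return (strtol(value, NULL, 10) != 0);
}

/* Collapse to a single domain when the tunable asks for NUMA to be off. */
int
ndomains_after_tunable(int detected, const char *value)
{
	return (numa_disabled_from_tunable(value) ? 1 : detected);
}
```

The user-facing knob is a yes/no switch, while the domain count stays an internal detail, which is the UI distinction the comment is making.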

markj marked an inline comment as done.
  • Fix vm_ndomains=1 case on arm64.
  • Change the tunable to vm.numa.disabled.
This revision now requires review to proceed. Oct 8 2018, 8:12 PM
markj marked an inline comment as done. Oct 8 2018, 8:13 PM
markj added inline comments.
sys/arm64/arm64/mp_machdep.c
579–581 ↗(On Diff #48804)

Are you sure that the value of pc_domain matters? I note that we read and store a numa-node-id property in the GICv3 code, and I'm able to boot a dual-socket ThunderX with this patch applied.

This revision is now accepted and ready to land. Oct 8 2018, 8:37 PM
alc added inline comments.
share/man/man4/numa.4
52–53 ↗(On Diff #48906)

You can shorten this to: "By default it attempts to balance allocations across the domains."

(As an aside, this sentence highlights the incomplete definition of NUMA above, specifically, that it does not mention the typical differences in interdomain versus intradomain bandwidth.)

sys/vm/vm_phys.c
608 ↗(On Diff #48906)

Style? d != 0

This revision was automatically updated to reflect the committed changes.