
Enable NUMA support in AMD64 GENERIC
AbandonedPublic

Authored by cem on Jan 25 2018, 1:47 AM.

Details

Reviewers
jhb
jeff
markj


Event Timeline

I'm not opposed, although I was thinking about doing this after some further changes and testing. Right now the default allocation policy is round-robin and there are some perf regressions.

I'm not sure we need to override the architecture's default MAXMEMDOM in the config file. There's not much value in it being under 64 on amd64, because the domainset vector is one long wide.

In D14037#294811, @jeff wrote:

> I'm not opposed, although I was thinking about doing this after some further changes and testing. Right now the default allocation policy is round-robin and there are some perf regressions.

Sure, no rush. We could restore the prior default allocation policy first.

> I'm not sure we need to override the architecture's default MAXMEMDOM in the config file. There's not much value in it being under 64 on amd64, because the domainset vector is one long wide.

Sure. Should the default MAXMEMDOM in machine/include/param.h be changed from 8 to 64?

I'm not sure what or how to measure, but this seems to have greatly improved interactivity on my dual-socket Ivy Bridge machine with the nvidia blob. I saw a similar desktop-responsiveness report on the -current mailing list.

In D14037#294812, @cem wrote:

> In D14037#294811, @jeff wrote:
>
>> I'm not opposed, although I was thinking about doing this after some further changes and testing. Right now the default allocation policy is round-robin and there are some perf regressions.
>
> Sure, no rush. We could restore the prior default allocation policy first.

That is probably not unreasonable.

>> I'm not sure we need to override the architecture's default MAXMEMDOM in the config file. There's not much value in it being under 64 on amd64, because the domainset vector is one long wide.
>
> Sure. Should the default MAXMEMDOM in machine/include/param.h be changed from 8 to 64?

The only downside is the static vm_dom[] array, which could be allocated at boot. Then there's nothing but the bitset that grows with MAXMEMDOM.

If you wanted to fix these two and commit the enable patch, I would feel better about it.

Remove MAXMEMDOM from the config file; adjust the header default up to 64 to match the bits per word.

Revert the default allocation policy to first-touch.

sys/amd64/conf/GENERIC:103

This can also be committed with the current MAXMEMDOM.

sys/amd64/include/param.h:77

OK, the downside to this is two very large arrays that are statically allocated: vm_dom[MAXMEMDOM] and vm_phys_free_queues[MAXMEMDOM] will both take too much memory on common systems.

sys/kern/kern_cpuset.c:1343

You can go ahead and commit this alone.

sys/amd64/include/param.h:77

struct vm_domain is only 896 bytes. 56 kB of waste is... some, but not especially huge on amd64 systems.

struct vm_freelist is 24 bytes; the free-queue dimensions (pools, lists, and orders) are 2, 3, and 13 respectively. So that's 118 kB of waste, which is in the same ballpark.

Leaving this at 8 gives a more reasonable ~19 kB of total waste on single-domain systems.

Have we seen NUMA systems with more than 8 domains?

sys/kern/kern_cpuset.c:1343

I saw it's in your project/numa branch; that might merge soon anyway?

sys/amd64/include/param.h:77

8 is probably the max right now: two domains per CPU in a 4-socket system.

16 would be pretty unusual, but possible with an 8-socket system. I'm fine leaving it for now.

I think these structures are very likely to bloat further with other optimizations, which is why it's on my mind.

sys/kern/kern_cpuset.c:1343

I'm cherry-picking patches one at a time. This one could go now. Since you've already got an approved review for it, would you mind?

I'm not doing any work where I need this urgently, and I'm unable to really test it at the moment.