I'm finding that it's useful to have static global domainsets for use in
kernel allocators (malloc(), kmem_malloc(), UMA, kstacks). Add some
plumbing to make that easy:
- Add a vm_phys routine that can be invoked by MD code once the NUMA topology is discovered. Right now this is only used by the SRAT/SLIT parser, but will likely be used by at least the FDT code on non-ACPI systems in the future.
- Predefined domainsets are allocated from .data instead of from the domainset zone. This lets us use them in UMA without introducing a bootstrapping problem. This results in a bit of bloat, but not much: struct domainset is 40 bytes with LP64, and MAXMEMDOM is small on all platforms.
- Predefine domainset_roundrobin and domainset_prefer.
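To illustrate the static-allocation idea in the list above, here is a minimal userland sketch, not the actual kernel code. Every name in it (`struct sketch_domainset`, `SKETCH_MAXMEMDOM`, `sketch_domainset_init()`, and the accessors) is a hypothetical stand-in, and the structure layout is simplified. It only shows the pattern: the predefined sets live in static storage, and an init routine fills them in once the number of domains is known.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for MAXMEMDOM; the real value is per-platform. */
#define SKETCH_MAXMEMDOM	8

enum sketch_policy {
	SKETCH_POLICY_ROUNDROBIN,
	SKETCH_POLICY_PREFER
};

/* Hypothetical, simplified stand-in for struct domainset. */
struct sketch_domainset {
	unsigned int		ds_mask;	/* bitmask of usable domains */
	enum sketch_policy	ds_policy;	/* allocation policy */
	int			ds_prefer;	/* preferred domain, or -1 */
	int			ds_cnt;		/* number of domains in ds_mask */
};

/*
 * Statically allocated, so no allocator has to be up before these
 * can be referenced.  The cost is one structure per possible domain.
 */
static struct sketch_domainset sketch_roundrobin;
static struct sketch_domainset sketch_prefer[SKETCH_MAXMEMDOM];

/*
 * Assumed to be called once, after the (hypothetical) MD code has
 * discovered how many memory domains exist.
 */
void
sketch_domainset_init(int ndomains)
{
	unsigned int mask;
	int i;

	assert(ndomains >= 1 && ndomains <= SKETCH_MAXMEMDOM);
	mask = (1u << ndomains) - 1;

	sketch_roundrobin.ds_mask = mask;
	sketch_roundrobin.ds_policy = SKETCH_POLICY_ROUNDROBIN;
	sketch_roundrobin.ds_prefer = -1;
	sketch_roundrobin.ds_cnt = ndomains;

	for (i = 0; i < ndomains; i++) {
		sketch_prefer[i].ds_mask = mask;
		sketch_prefer[i].ds_policy = SKETCH_POLICY_PREFER;
		sketch_prefer[i].ds_prefer = i;
		sketch_prefer[i].ds_cnt = ndomains;
	}
}

/* Accessors: callers get a pointer into static storage. */
const struct sketch_domainset *
sketch_domainset_roundrobin(void)
{
	return (&sketch_roundrobin);
}

const struct sketch_domainset *
sketch_domainset_prefer(int domain)
{
	assert(domain >= 0 && domain < SKETCH_MAXMEMDOM);
	return (&sketch_prefer[domain]);
}
```

Because the accessors never allocate, an allocator could take one of these pointers during its own bootstrap, which is the property the second bullet is after.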
Did you look at the generated code? The compiler might spill 'generation' into memory and then reload it after invpcid, instead of keeping the intermediate value in a register. I expect that more from clang than from gcc; the latter usually generates better code.
Also, such a single-use variable deserves a comment, since my first action when changing code like this would be to remove the once-used aliases.
It sounds as if you want to rewrite the handlers back into assembler.