Move pcpu KVA out of .bss into dynamically allocated VA at pmap_bootstrap(). This avoids demoting the superpage mapping of .data/.bss. It also makes it possible to use pmap_qenter() to install the domain-local pcpu page on NUMA configs.
Remove unnecessary VM_ALLOC_ZERO from allocation of the domain-local page for pcpu.
Do not constrain allocations for doublefault, boot, and mce stacks. All these stacks are used only once (doublefault, boot) or very rarely (mce).
Refactor pcpu and IST initialization by moving it into helper functions.
[Basically this is my previous patch, rebased on top of Jeff's commit, with unnecessary diffs reduced. I do not like the extern struct pcpus declarations spread over the sources, and plan to fix that after this commit lands.]