Remove the secondary_stacks array in arm64 and riscv kernels.
Instead, dynamically allocate a page for the boot stack of each AP when
starting them up, like we do on x86.  This shrinks the bss by
MAXCPU*KSTACK_PAGES pages, which corresponds to 4MB on arm64 and 256KB
on riscv.
Duplicate the logic used on x86 to free the bootstacks, by using a
sysinit to wait for each AP to switch to a thread before freeing its
stack.
While here, mark some static MD variables as such.
Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Juniper Networks, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D24158