While getting NUMA amd64 to boot without MD_SMALL_ALLOC, I noticed a number of NUMA bugs as well as zone alignment that was weaker than required.  In practice, the alignment of zones was already dictated by the alignment of the per-CPU cache structure, so this change just makes the existing behavior explicit while simplifying the code.
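
As a rough illustration of why an embedded per-CPU cache ends up dictating zone alignment, here is a minimal C11 sketch. It is not the kernel code: `struct pcpu_cache`, `struct zone`, `CACHE_LINE`, and `zone_alloc_size()` are hypothetical names chosen for this example, and the 64-byte cache line is an assumption.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size, for illustration only. */
#define CACHE_LINE 64

/* Hypothetical per-CPU cache, aligned to a cache line. */
struct pcpu_cache {
	alignas(CACHE_LINE) void *alloc_bucket;
	void *free_bucket;
};

/* Hypothetical zone that ends in an array of per-CPU caches. */
struct zone {
	const char *name;
	size_t item_size;
	struct pcpu_cache cpu[];	/* one entry per CPU */
};

/* The containing structure inherits the strictest member alignment. */
static_assert(alignof(struct zone) == alignof(struct pcpu_cache),
    "zone alignment follows the per-CPU cache alignment");
static_assert(offsetof(struct zone, cpu) % CACHE_LINE == 0,
    "per-CPU cache array starts on a cache line boundary");

/* Bytes needed for a zone with room for ncpus per-CPU caches. */
size_t
zone_alloc_size(unsigned ncpus)
{
	return (sizeof(struct zone) + (size_t)ncpus * sizeof(struct pcpu_cache));
}
```

Because the compiler already derives the zone's alignment from the embedded cache structure, stating that alignment explicitly in the allocation path only documents what was true before.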