Fix arm64 TLB invalidation with non-4k pages
When using 16k or 64k pages atop will shift the address by more than
the needed amount for a tlbi instruction. Replace this with a new macro
to shift the address by 12 and use PAGE_SIZE in the for loop to let the
code work with any page size.
Reviewed by: alc, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34516
(cherry picked from commit 813738fabaaea43503724b8371faf5bab73a3047)