jemalloc obtains its page size at compile time. It does not work when the compiled-in page size is smaller than the runtime page size: it asserts that the kernel's page size is no larger than the compiled-in page size. This prevents booting a kernel with 16k pages without building a matching world. It also makes downgrading to a 4k world when booted into a 16k kernel "fun". And it makes running pre-built 4k static binaries from packages (or other sources, like golang) impossible under a 16k kernel (e.g., pkg-static).
However, jemalloc runs just fine when the compiled-in page size is larger than the runtime page size.
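
The compatibility rule described above can be sketched in a few lines of C. This is only an illustration, not jemalloc's actual code: COMPILED_LG_PAGE, page_size_compatible(), runtime_page_size() and assert_page_size_ok() are hypothetical names made up for this example, and the default of 14 (16k) is an assumption chosen to match the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the allocator's compile-time page shift:
 * 12 means 4k pages, 14 means 16k pages.
 */
#ifndef COMPILED_LG_PAGE
#define COMPILED_LG_PAGE 14
#endif
#define COMPILED_PAGE_SIZE ((size_t)1 << COMPILED_LG_PAGE)

/*
 * The allocator can run only when the kernel's page size is no larger
 * than the page size it was compiled for.
 */
static bool
page_size_compatible(size_t compiled, size_t runtime)
{
	return (runtime <= compiled);
}

/* Ask the running kernel for its page size; returns 0 if unavailable. */
static size_t
runtime_page_size(void)
{
	long r = sysconf(_SC_PAGESIZE);

	return (r > 0 ? (size_t)r : 0);
}

/* Startup-style check, analogous in spirit to the assertion described above. */
static void
assert_page_size_ok(void)
{
	assert(page_size_compatible(COMPILED_PAGE_SIZE, runtime_page_size()));
}
```

With a compiled-in size of 4k, the check fails on a 16k kernel. With a compiled-in size of 16k, it passes on both 4k and 16k kernels, which is the asymmetry this change relies on.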
To improve interoperability between 16k and 4k, I'd like to increase the compiled-in page size of jemalloc to 16k. This will waste a small amount of space, but the payoff is making a 16k kernel much easier to use. For some workloads (like static web serving), 16k pages show up to a 25% performance improvement in my testing.