Effectively all i386 kernels now have two pmaps compiled in: one managing PAE pagetables, and another non-PAE. The implementation is selected at cold time depending on the CPU features. The vm_paddr_t is always 64bit now. As result, nx bit can be used on all capable CPUs.
Option PAE only affects the bus_addr_t: it is still 32bit for non-PAE configs, for drivers compatibility. Kernel layout, esp. max kernel address, low memory PDEs and max user address (same as trampoline start) are now same for PAE and for non-PAE regardless of the type of page tables used.
Non-PAE kernel (when using PAE pagetables) can handle physical memory up to 24G now, larger memory requires re-tuning the KVA consumers and instead the code caps the maximum at 24G. Unfortunately, a lot of drivers do not use busdma(9) properly so by default even 4G barrier is not easy. There are two tunables added: hw.above4g_allow and hw.above24g_allow, the first one might be worth to try, second is only for dev use.
i386 now creates three freelists if there is any memory above 4G, to allow proper bounce pages allocation. Also, VM_KMEM_SIZE_SCALE changed from 3 to 1.
Tested by: pho