On platforms without a direct map, vm_map_insert() may in rare
situations need to allocate a kernel map entry in order to allocate
kernel map entries. This poses a problem similar to the one solved for
vmem boundary tags by vmem_bt_alloc(). In fact, this problem is a bit
trickier: in the kernel map case we must allocate entries with the
kernel map locked, whereas vmem can recurse into itself because boundary
tags are allocated up front. This diff solves the problem.
The diff adds a custom slab allocator for kmapentzone that allocates
KVA directly from kernel_map, bypassing the kmem_* layer. This avoids
mutual recursion with the vmem boundary tag allocator. Then, when
vm_map_insert() allocates a new kernel map entry, it uses M_NOVM to
avoid triggering allocation of a new slab until after the insertion is
complete. Instead, vm_map_insert() allocates from the zone's reserve
and sets a flag in kernel_map that triggers re-population of the
reserve just before the map is unlocked.
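
A minimal sketch of how these pieces fit together, not the patch
itself. The names kmapent_alloc_kva(), kmapent_entry_create(),
kmapent_replenish(), KMAPENT_RESERVE and MAP_REPLENISH are hypothetical
stand-ins, the direct kernel_map KVA allocation is abstracted away, and
the real change folds this into vm_map.c's existing entry-creation and
unlock paths:

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/malloc.h>
#include <vm/vm.h>
#include <vm/vm_map.h>
#include <vm/uma.h>

/* Hypothetical names used only for this sketch. */
#define KMAPENT_RESERVE 8               /* items kept in the zone reserve */
#define MAP_REPLENISH   0x80000000      /* stand-in for a kernel_map flag */

static uma_zone_t kmapentzone;

/*
 * Stand-in for the direct allocation of wired KVA from kernel_map.
 * It must also cope with the kernel map already being locked by the
 * current thread, which is the tricky case described above.
 */
static vm_offset_t kmapent_alloc_kva(vm_size_t, int, int);

/*
 * Custom slab allocator for kmapentzone.  KVA comes straight from
 * kernel_map rather than from kmem_*(), so slab allocation cannot
 * recurse into the vmem boundary-tag allocator.
 */
static void *
kmapent_alloc(uma_zone_t zone, vm_size_t bytes, int domain, uint8_t *pflag,
    int wait)
{
        vm_offset_t addr;

        addr = kmapent_alloc_kva(bytes, domain, wait);
        if (addr == 0)
                return (NULL);
        *pflag = UMA_SLAB_PRIV;         /* slab came from a private backend */
        return ((void *)addr);
}

/* Zone setup at boot: install the allocator and build a small reserve. */
static void
kmapent_zone_init(void)
{
        kmapentzone = uma_zcreate("KMAP ENTRY", sizeof(struct vm_map_entry),
            NULL, NULL, NULL, NULL, UMA_ALIGN_PTR, UMA_ZONE_VM);
        uma_zone_set_allocf(kmapentzone, kmapent_alloc);
        uma_zone_reserve(kmapentzone, KMAPENT_RESERVE);
        uma_prealloc(kmapentzone, KMAPENT_RESERVE);
}

/* Entry allocation while kernel_map is locked: dip into the reserve. */
static struct vm_map_entry *
kmapent_entry_create(vm_map_t map)
{
        struct vm_map_entry *entry;

        entry = uma_zalloc(kmapentzone, M_NOWAIT | M_NOVM | M_USE_RESERVE);
        if (entry == NULL)
                panic("kernel map entry reserve exhausted");
        /* Remember to top up the reserve before dropping the map lock. */
        map->flags |= MAP_REPLENISH;
        return (entry);
}

/* Called just before vm_map_unlock(kernel_map). */
static void
kmapent_replenish(vm_map_t map)
{
        if ((map->flags & MAP_REPLENISH) != 0) {
                map->flags &= ~MAP_REPLENISH;
                uma_prealloc(kmapentzone, KMAPENT_RESERVE);
        }
}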
I thought about a scheme for preallocating all of the KVA required for
kernel map entries during boot, like we do for radix nodes with
uma_zone_reserve_kva(). However, it's difficult to come up with a
reasonable upper bound for the number of kernel map entries that may be
required.
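
For comparison, the rejected alternative would look roughly like the
following, reusing kmapentzone from the sketch above and assuming some
upper bound KMAPENT_MAX on the number of entries; choosing a sound
value for that bound is exactly the difficulty:

/*
 * Rejected alternative, sketched for comparison: reserve KVA for a
 * fixed maximum number of kernel map entries at boot, as is done for
 * radix nodes.  KMAPENT_MAX is hypothetical.
 */
static void
kmapent_reserve_kva(void)
{
        if (uma_zone_reserve_kva(kmapentzone, KMAPENT_MAX) == 0)
                panic("failed to reserve KVA for kernel map entries");
}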