This change has apparently been sitting in my local tree for a long time.
I do not see why it is impossible for, e.g., two threads to fault on the same address simultaneously: one thread creates the mapping in pmap_enter() and promotes it, and only then does the second thread enter pmap_enter(), delayed either by locking alone or by locking and scheduling. In this case, the amd64 pmap demotes the large page first, while i386, by contrast, panics.
Handle this case in the i386 pmap_enter() the same way as amd64: demote the large page.
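
To illustrate the scenario, here is a minimal userspace sketch of "demote instead of panic". It is a toy model, not the pmap code: struct toy_pde, toy_promote(), toy_demote(), toy_enter() and TOY_NPTE are all hypothetical names invented for this example, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPTE 4	/* toy value; a real page table has far more entries */

/* Hypothetical toy directory entry; not the kernel's pd_entry_t. */
struct toy_pde {
	bool		valid;
	bool		superpage;	/* stands in for a large-page mapping */
	unsigned long	base;		/* first frame of the large mapping */
	unsigned long	pte[TOY_NPTE];	/* small-page entries */
	bool		pte_valid[TOY_NPTE];
};

/* What the first faulting thread leaves behind: a promoted mapping. */
static void
toy_promote(struct toy_pde *pde, unsigned long base)
{
	pde->valid = true;
	pde->superpage = true;
	pde->base = base;
}

/* Split the large mapping into small entries covering the same frames. */
static void
toy_demote(struct toy_pde *pde)
{
	assert(pde->valid && pde->superpage);
	for (size_t i = 0; i < TOY_NPTE; i++) {
		pde->pte[i] = pde->base + i;
		pde->pte_valid[i] = true;
	}
	pde->superpage = false;
}

/*
 * What the second thread does when it arrives late: if the entry was
 * already promoted, demote it first rather than treating it as fatal.
 * Returns 0 on success, -1 on a bad index.
 */
static int
toy_enter(struct toy_pde *pde, size_t idx, unsigned long frame)
{
	if (idx >= TOY_NPTE)
		return (-1);
	if (pde->valid && pde->superpage)
		toy_demote(pde);
	pde->valid = true;
	pde->pte[idx] = frame;
	pde->pte_valid[idx] = true;
	return (0);
}
```

Calling toy_promote() and then toy_enter() on the same entry mimics the ordering described above: the late caller finds a large mapping, splits it, and installs its small entry.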