The collapse scan may drop the object lock in order to sleep. Apparently, this opens two races: one between two parallel collapse scans, and a second between the page fault handler and the scan. r291576 amplified the effect, so the issues are now reported often by real users.
Assume that a scan is running, the object lock was dropped, and a second, parallel collapse is started. What I saw live was a thread performing fork and calling collapse from copy_entry. This gives the first race. The winning collapse bumps the reference count on one of the backing objects, triggering the panic 'backing_object was somehow re-referenced' in the other scan. To fix this, I increment the paging in progress counter for the top object; the (slightly adapted) check at the start of the scan then prevents parallel scans.
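The guard described above can be sketched as a small userland model. This is only an illustration under stated assumptions, not the kernel code: `struct model_object`, `model_collapse_begin()` and `model_collapse_end()` are hypothetical names, and locking is omitted because the model is single-threaded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland stand-in for the top object; not struct vm_object. */
struct model_object {
	int paging_in_progress;	/* models the paging in progress counter */
};

/*
 * Models the check at the start of the scan: a non-zero counter on the
 * top object means that some activity, possibly another scan that has
 * dropped the lock to sleep, is still in flight, so a new scan must not
 * start.  On success the counter is bumped and stays bumped while the
 * scan sleeps.
 */
bool
model_collapse_begin(struct model_object *obj)
{
	if (obj->paging_in_progress != 0)
		return (false);
	obj->paging_in_progress++;
	return (true);
}

/* Models the end of the scan: drop the counter taken at the start. */
void
model_collapse_end(struct model_object *obj)
{
	assert(obj->paging_in_progress > 0);
	obj->paging_in_progress--;
}
```

In this model a second `model_collapse_begin()` on the same object fails until the first caller has run `model_collapse_end()`, which is the property that keeps the losing scan from ever observing a re-referenced backing object.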
Similarly, another thread may fault on or wire the entry served by the scanned object. Then, since a waitable scan sets the OBJ_DEAD flag, the fault handler returns EFAULT, causing a spurious SIGSEGV in the process. The fix is to wait until the scan finishes. Unfortunately, the situation is similar to the "vmo_de" sleep: the reference held on the backing object by the top of the shadow list is what prevents fs.object from becoming invalid, and we must not reference it directly, or the scan would panic. So I cannot sleep precisely until the scan finishes; instead I retry the fault handler after a pause.
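The retry-after-pause approach can be illustrated with another small userland model. All names here (`struct model_fault_object`, `MODEL_OBJ_DEAD`, `model_pause()`, `model_fault()`) are hypothetical, and the `max_retries` bound exists only so that the model always terminates; it is an assumption of the sketch, not something the change is stated to do.

```c
#include <errno.h>

#define	MODEL_OBJ_DEAD	0x1	/* hypothetical stand-in for OBJ_DEAD */

/* Hypothetical stand-in for the object found by the fault handler. */
struct model_fault_object {
	int flags;
	int scan_pauses_left;	/* pauses until the modelled scan finishes */
};

/*
 * Stand-in for a short sleep.  The fault handler cannot wait on the
 * scan itself, so time simply passes; in this model the scan finishes
 * after a fixed number of pauses and the dead flag goes away.
 */
void
model_pause(struct model_fault_object *obj)
{
	if (obj->scan_pauses_left > 0 && --obj->scan_pauses_left == 0)
		obj->flags &= ~MODEL_OBJ_DEAD;
}

/*
 * Instead of failing with EFAULT as soon as the dead flag is seen,
 * pause and restart the fault.  Returns 0 on success, or EFAULT once
 * the model's retry bound is exhausted.
 */
int
model_fault(struct model_fault_object *obj, int max_retries, int *retries)
{
	*retries = 0;
	for (;;) {
		if ((obj->flags & MODEL_OBJ_DEAD) == 0)
			return (0);
		if (*retries == max_retries)
			return (EFAULT);
		model_pause(obj);
		(*retries)++;
	}
}
```

The point of the sketch is that the retry re-examines the state from scratch after each pause rather than holding a reference to the object across the sleep, which is the constraint described above.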
As usual, I have no good idea what to do with a wired invalid page found by either the fault handler or the scan. release_page() and the assert in the scan were modified to leave such pages alone.
This review also includes the fix for the access-after-free of a map entry reported by avg. It touches a related area, and I suspect that the failpoints used would trigger it as well.
Also, at least three trivial bugs in the handlers for the improbable failure of vm_radix_insert were fixed.