This needs to go in before my collapse rewrite. In retrospect, the title may be slightly misleading. This fixes the three-level shadow chain problem that I described in my December emails, "fault cow race", which becomes more pronounced once I weaken the protection provided by pip (paging_in_progress).
Stated as simply as I can: fault can copy a page from an arbitrarily deep level of nesting in the backing object chain, while the collapse scan only prevents collapses of immediately adjacent objects in the chain. Before this patch, we simply looked at the page count and page validity to determine whether an object completely shadows its backing_object. Because we atomically swap the backing page and the first_object page in pmap_enter(), the backing_object page can still be present in the pmap after the shadow check would already return true.
Today, this bug only triggers with deep chains. With my collapse patch, it becomes possible with only a pair of objects, because pip no longer stops us from scanning the shadow chain.
To address this, I prevent scan_all_shadowed from returning true if a page in the backing_object is xbusy, and I always do the full scan. I also added another page pointer to the fault state. This page is left xbusy until the fault completes or is restarted. In this way, we hold the original backing object page until the COW replacement is complete.
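As a rough illustration of the rule above, here is a minimal userspace sketch in C. It is a toy model, not the kernel code: `toy_page`, `toy_object`, `toy_faultstate`, the `m_held` field, and all `toy_*` functions are hypothetical names invented for this example, and the real scan also has to consider things this model ignores (locking, swap-backed pages, offsets between objects).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 4	/* toy object size, in pages */

/* Hypothetical, simplified stand-ins; not the kernel's structures. */
struct toy_page {
	bool valid;	/* contents are fully valid */
	bool xbusy;	/* exclusively busied, e.g. by an in-flight fault */
};

struct toy_object {
	struct toy_page *pages[TOY_NPAGES];	/* NULL means not resident */
	struct toy_object *backing;		/* next object in the chain */
};

struct toy_faultstate {
	struct toy_page *m_held;	/* hypothetical: backing page kept xbusy */
};

/*
 * Full scan: report "completely shadowed" only if every resident page in
 * the backing object is covered by a valid page in obj AND none of the
 * backing pages is xbusy. An xbusy backing page may still be in use as
 * the source of a COW copy, so the backing object must not be collapsed.
 */
static bool
toy_scan_all_shadowed(const struct toy_object *obj)
{
	const struct toy_object *backing;
	const struct toy_page *bp, *p;
	size_t i;

	assert(obj != NULL);
	backing = obj->backing;
	if (backing == NULL)
		return (false);
	for (i = 0; i < TOY_NPAGES; i++) {
		bp = backing->pages[i];
		if (bp == NULL)
			continue;	/* nothing to shadow at this index */
		if (bp->xbusy)
			return (false);	/* a fault still holds this page */
		p = obj->pages[i];
		if (p == NULL || !p->valid)
			return (false);	/* backing page is still visible */
	}
	return (true);
}

/* Fault side: keep the source page xbusy for the duration of the copy. */
static void
toy_fault_hold(struct toy_faultstate *fs, struct toy_page *m)
{
	m->xbusy = true;
	fs->m_held = m;
}

/* Called when the fault completes or is restarted. */
static void
toy_fault_release(struct toy_faultstate *fs)
{
	if (fs->m_held != NULL) {
		fs->m_held->xbusy = false;
		fs->m_held = NULL;
	}
}
```

In this model, a scan that runs between `toy_fault_hold()` and `toy_fault_release()` returns false even when the first object already has a valid copy of the page, which is the window the patch is meant to close.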
Long term, I would like to make a shared busy lock type that allows us to do concurrent faults while preventing pageouts. This can't have the same semantics as shared busy, which still permits page invalidation. For now, I don't think the extra serialization and scans are particularly problematic. My entire object locking branch completes a parallel buildkernel on a 100-core machine in one tenth the time of current.