Imagine this scenario: several threads or processes share a shadow object that maps a vnode, and they all fault simultaneously on the same address, whose page is currently busy. The first thread through will acquire the first_object lock, allocate a busy page, follow the shadow chain (unlocking first_object), discover the busy page in the backing object, re-lock first_object, free first_m, and wait. Each subsequent thread repeats this with minor variations depending on how the operations are interleaved under the first_object lock. If a thread races in before the free, it sleeps on a busy page that is about to be freed, wakes immediately, and loops to repeat the whole process. If it arrives after the first thread's free, it simply repeats the allocation.
If we instead leave the page allocated, each subsequent thread skips the allocation and either waits for the first thread through the fault to validate the page (if it is paging in) or reaches the backing_object page and waits there, without each thread allocating and freeing a page of its own.
This also has the advantage of not requiring the object lock to be re-acquired just to free the page. If we are low on memory, pageout is already capable of freeing this page. The total number of additional invalid pages on the inactive queue is bounded by 2 * threads, if every thread is stuck waiting in fault.
I identified two places in vm_object that did not expect invalid pages on default objects. I _think_ this is all of them.