On vm_page_rename failure, fix a missing object unlock and a double
free of a page.
First remove the old page, then rename the other page into
first_object, then free the old page. This avoids the problem on
rename failure. This is a little ugly but seems to be the most
straightforward solution.
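
The ordering above can be illustrated with a toy user-space model. This is
only a sketch, not the patch itself: every toy_* name, both structs, and the
fail_alloc flag are hypothetical stand-ins, and none of the kernel's locking
or retry logic is modelled. It also assumes that on rename failure the old
page is left un-freed for the caller's cleanup path, so it is freed exactly
once on either path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's types or functions. */
struct toy_page {
	int id;
	bool freed;
};

struct toy_object {
	struct toy_page *slot;	/* models a single page index */
};

/* Hypothetical: detach the page from the object without freeing it. */
static struct toy_page *
toy_remove(struct toy_object *obj)
{
	struct toy_page *old = obj->slot;

	obj->slot = NULL;
	return (old);
}

/*
 * Hypothetical: move page m into obj. fail_alloc simulates a failed
 * non-sleeping allocation. Returns 0 on success, 1 on failure.
 */
static int
toy_rename(struct toy_object *obj, struct toy_page *m, bool fail_alloc)
{
	assert(obj->slot == NULL);	/* the slot must already be empty */
	if (fail_alloc)
		return (1);
	obj->slot = m;
	return (0);
}

/* Hypothetical: free a page; the assert trips on a double free. */
static void
toy_free(struct toy_page *m)
{
	assert(!m->freed);
	m->freed = true;
}

/*
 * Remove the old page, rename the new page in, then free the old page.
 * Assumption of this sketch: on failure the old page is left un-freed
 * so the caller's cleanup can free it exactly once.
 */
static int
toy_replace(struct toy_object *first_object, struct toy_page *m,
    bool fail_alloc)
{
	struct toy_page *old = toy_remove(first_object);

	if (toy_rename(first_object, m, fail_alloc) != 0)
		return (1);
	if (old != NULL)
		toy_free(old);
	return (0);
}
```

Because the free happens only after the rename has succeeded, the failure
path in this model never frees the old page itself, so a later cleanup that
frees it cannot trip the double-free assertion.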
This bug is hit frequently enough that it blocks further exploration of
M_NOWAIT bugs with RADIX NODE (including D4146).
This is the patch currently applied internally at Isilon. A few other
solutions considered:
- Inline pieces of unlock_and_deallocate. This is efficient in terms of the various locks but seems fragile if the recipe for retrying the fault ever changes.
- Patching unlock_and_deallocate. This seems okay in principle but would be easier to do if I understood vm_fault_hold better. I am nervous about e.g. putting an if (fs.first_m != NULL) guard there and possibly covering up problems at the other various call sites.
- Enhance vm_page_replace and call that instead of vm_page_rename. Whether that is a good idea is arguable, and it can be explored later.