The computations of vm_map_splay_split and vm_map_splay_merge touch both children of every entry on the search path as part of updating values of the max_free field. By comparing the max_free values of an entry and its child on the search path, the code can avoid accessing the child off the path in cases where the max_free value decreases along the path.
Specifically, this patch changes splay_split so that the max_free field of every entry on the search path is replaced, temporarily, by the max_free field from its child not on the search path or, if the child in that direction is NULL, then a difference between start and end values of two pointers already available in the split code, without following any next or prev pointers. However, to find that max_free value does not require looking toward that other child if either the child on the search path has a lower max_free value, or the current max_free value is zero, because in either case we know that the value of max_free for the other child is the value we already have. So, the changes to vm_entry_splay_split make sure that we know all the off-search-path entries we will need to complete the splay, without looking at all of them. There is an exception at the bottom of the search path where we cannot rely on the max_free value in the direction of the NULL pointer that ends the search, because of the behavior of entry-clipping code.
The corresponding change to vm_splay_entry_merge makes it simpler, since it's just reversing pointers and updating running maxima.
In a test intended to exercise vigorously the vm_map implementation, the effect of this change was to reduce the data cache miss rate by 10-14% and the running time by 5-7%.
MFC: 3 months
I would paste the above in as commit message when testing is complete and the patch can be checked in. If any mentors want to suggest improvements to this checkin message, please do.
Tested by: pho