The popular splay implementation adapted for use in vm_map.c can be made to do fewer comparisons and null checks, at the cost of a bit more code, by recognizing that when the next two steps are in opposite directions, the current and next nodes can be attached to left and right subtrees being assembled, rather than simply attaching the current and re-iterating on the next. This also means that a search for a missing element between the root and its predecessor or successor will not restructure the tree.
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
sys/vm/vm_map.c | ||
---|---|---|
928 | This can also be a redundant comparison if addr < y->start but y->left == NULL. |
sys/vm/vm_map.c | ||
---|---|---|
928 | This would get rid of your redundant comparison. Do you prefer it? if (addr < y->start) { if (y->left != NULL) { /* Rotate right and put y on rlist. */ root->left = y->right; y->right = root; vm_map_entry_set_max_free(root); root = y->left; y->left = rlist; rlist = y; continue; } } else if (addr >= y->end) { if (y->right == NULL) break; /* Put root on rlist, y on llist. */ root->left = rlist; rlist = root; root = y->right; y->right = llist; llist = y; continue; } /* Put root on rlist. */ root->left = rlist; rlist = root; root = y; break; |
I've tested the patch with compiler.compiler from SPECjvm2008. I'm afraid that there is no discernible difference in the number of cycles spent in vm_map_entry_splay().
x before + after +-------------------------------------------------------------------------------+ | + | |+ + x + x +x xx x +x x x + + +| | |___________|__________AM_A_____M__|________________| | +-------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 9 20905598 23828774 22610639 22557854 994238.6 + 9 19257738 25797772 23263976 22777334 2164232.2 No difference proven at 95.0% confidence
I see an 18.15% reduction in cycles spent in vm_map_entry_splay() for buildworld.
x before_world + after_world +------------------------------------------------------------------------------+ |+ x| |+ x| |+ x| |++ xx| |A| |A| +------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 5 4.8476252e+10 4.8601982e+10 4.8593236e+10 4.8570385e+10 53214127 + 5 3.9698544e+10 3.983885e+10 3.9745925e+10 3.9755596e+10 51061053 Difference at 95.0% confidence -8.81479e+09 +/- 7.60559e+07 -18.1485% +/- 0.156589% (Student's t, pooled s = 5.21487e+07)
A problem with using any SPECjvm2008 test for evaluating this patch is that by default the SPECjvm2008 tests run for a fixed amount of time. Essentially, speeding up vm_map_entry_splay() just means that there will be more calls to vm_map_entry_splay(), and so it's not entirely surprisingly that there is no statistical difference in the number of cycles spent in vm_map_entry_splay().
Here are results from a different Java application.
x javamark_1.5GB_before + javamark_1.5GB_after +--------------------------------------------------------------------------------------------------+ |+ + + + x +xx x x x +| | |____________________________________A_M_______________________|__________|____M______|| +--------------------------------------------------------------------------------------------------+ N Min Max Median Avg Stddev x 6 7180627 7524006 7463407 7416031.3 121934.89 + 6 6513706 7541927 7042828 7020509.2 391280.84 Difference at 95.0% confidence -395522 +/- 372781 -5.33334% +/- 5.02669% (Student's t, pooled s = 289801)
On a failed lookup, stop trying to avoid pulling both neighbors of the missing item to the top of the tree.
Save the new 'root' value in a variable 'z' to avoid having to access the same child of y twice, once for a null check and later for assignment to root.
sys/vm/vm_map.c | ||
---|---|---|
932 | This case, and the symmetric 'rotate left' case below, are the only cases in this splay code in which a node is modified, but not put on either rlist or llist. We know that max_free is calculated for those nodes in the second pass up the tree. Do we need to recompute max_free for this case, for 'root'? Yes. Suppose that at the top of the loop 'd' was root in a tree like this: We know that no changes are happening to the subtrees C and E the rest of the splay, as the next root value is b and further changes happen in only that subtree. |
sys/vm/vm_map.c | ||
---|---|---|
932 | Change 'b' in the last line to 'a'. |