Page MenuHomeFreeBSD

Use macros to search vm_map
ClosedPublic

Authored by dougm on Mon, Nov 25, 8:54 AM.

Details

Summary

Replace a few utility functions for vm_map manipulation with macros, and add a boolean status argument to give the compiler a chance to generate better code.

Test Plan

On 2 tests that exercise the vm_map vigoriously, these are the runtimes and dc-misses with and without the change in place:

After:
Test 1 Test 2
seconds dc-misses seconds dc-misses
17.518129 719615104 16.613336 694618946
17.544017 717358163 16.560833 695521981
17.498316 722645549 16.519886 693421798
17.481360 723478804 16.559717 689488954
17.505018 720791939 16.494966 689094378

Before:
Test 1 Test 2
seconds dc-misses seconds dc-misses
18.448940 697424367 17.487485 680852142
18.520613 700154192 17.488678 670644687
18.369913 696744015 17.553468 668572905
18.412856 704237817 17.457770 664390472
18.513684 700244077 17.480940 664748569

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

dougm created this revision.Mon, Nov 25, 8:54 AM
dougm added a subscriber: pho.Mon, Nov 25, 3:54 PM
dougm updated this revision to Diff 64854.Mon, Nov 25, 4:47 PM

Restore gap_end variable to fix findspace problem.

pho added a comment.Tue, Nov 26, 7:18 AM

I ran tests on D22544.64854 for 14 hours. No problems seen.

markj added a comment.Tue, Nov 26, 5:08 PM

Do you get similar results by using C functions with the __always_inline attribute instead of macros?

dougm updated this revision to Diff 64894.Tue, Nov 26, 5:49 PM
dougm edited the test plan for this revision. (Show Details)

Replace the macros with __always_inline functions, which preserves, and perhaps slightly enhances the performance benefit.

markj added inline comments.Wed, Nov 27, 4:26 PM
sys/vm/vm_map.c
1096 ↗(On Diff #64894)

Isn't *found true iff root != NULL? Why do we need found at all?

dougm updated this revision to Diff 64952.Wed, Nov 27, 5:43 PM
dougm marked an inline comment as done.

Stop trying to insert 'found' into the splay search code.

dougm added inline comments.Wed, Nov 27, 5:45 PM
sys/vm/vm_map.c
1096 ↗(On Diff #64894)

I thought I was saving a root != NULL test and letting the break cases in LEFT_STEP/RIGHT_STEP go directly to handling the root!=NULL cases without testing. But I guess I was wrong. After stripping out 'found', I get:

17.482829 719261050 16.607015 704332901
17.536821 744567159 16.530198 691722248
17.491278 745792388 16.500271 692419053
17.513297 731784319 16.567252 709416066
17.473170 721502357 16.543261 709815515

which looks about the same. So I'll take it out.

1102 ↗(On Diff #64952)

I hear you ask - why not just keep this a simple while loop? Well, because it can't a simple while loop in the threaded tree, and I seek to make the final patch that switches from not-threaded to threaded to be one that can be easily understood, if I ever get there.

dougm updated this revision to Diff 64953.Wed, Nov 27, 6:11 PM

Go ahead and make findnext and findprev look less weird.

markj accepted this revision.Wed, Nov 27, 6:16 PM
This revision is now accepted and ready to land.Wed, Nov 27, 6:16 PM
dougm added a subscriber: alc.Wed, Nov 27, 7:12 PM
alc added a comment.Wed, Nov 27, 7:24 PM

Can you explain why this change reduces the number of L1 data cache misses?

dougm added a comment.Wed, Nov 27, 8:48 PM
In D22544#493747, @alc wrote:

Can you explain why this change reduces the number of L1 data cache misses?

I haven't observed that. I didn't expect to impact L1 data cache misses.

alc added a comment.Wed, Nov 27, 8:57 PM
In D22544#493747, @alc wrote:

Can you explain why this change reduces the number of L1 data cache misses?

I haven't observed that. I didn't expect to impact L1 data cache misses.

I misread the results.

This revision was automatically updated to reflect the committed changes.