Page MenuHomeFreeBSD

Use macros to search vm_map
ClosedPublic

Authored by dougm on Nov 25 2019, 8:54 AM.
Tags
None
Referenced Files
F103764983: D22544.id64847.diff
Fri, Nov 29, 3:23 AM
Unknown Object (File)
Mon, Nov 18, 1:18 AM
Unknown Object (File)
Sun, Nov 17, 11:09 PM
Unknown Object (File)
Fri, Nov 8, 3:28 PM
Unknown Object (File)
Fri, Nov 8, 8:12 AM
Unknown Object (File)
Fri, Nov 8, 6:08 AM
Unknown Object (File)
Fri, Nov 8, 4:31 AM
Unknown Object (File)
Oct 5 2024, 9:44 PM
Subscribers

Details

Summary

Replace a few utility functions for vm_map manipulation with macros, and add a boolean status argument to give the compiler a chance to generate better code.

Test Plan

On 2 tests that exercise the vm_map vigoriously, these are the runtimes and dc-misses with and without the change in place:

After:
Test 1 Test 2
seconds dc-misses seconds dc-misses
17.518129 719615104 16.613336 694618946
17.544017 717358163 16.560833 695521981
17.498316 722645549 16.519886 693421798
17.481360 723478804 16.559717 689488954
17.505018 720791939 16.494966 689094378

Before:
Test 1 Test 2
seconds dc-misses seconds dc-misses
18.448940 697424367 17.487485 680852142
18.520613 700154192 17.488678 670644687
18.369913 696744015 17.553468 668572905
18.412856 704237817 17.457770 664390472
18.513684 700244077 17.480940 664748569

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

Restore gap_end variable to fix findspace problem.

I ran tests on D22544.64854 for 14 hours. No problems seen.

Do you get similar results by using C functions with the __always_inline attribute instead of macros?

dougm edited the test plan for this revision. (Show Details)

Replace the macros with __always_inline functions, which preserves, and perhaps slightly enhances the performance benefit.

sys/vm/vm_map.c
1093

Isn't *found true iff root != NULL? Why do we need found at all?

dougm marked an inline comment as done.

Stop trying to insert 'found' into the splay search code.

sys/vm/vm_map.c
1093

I thought I was saving a root != NULL test and letting the break cases in LEFT_STEP/RIGHT_STEP go directly to handling the root!=NULL cases without testing. But I guess I was wrong. After stripping out 'found', I get:

17.482829 719261050 16.607015 704332901
17.536821 744567159 16.530198 691722248
17.491278 745792388 16.500271 692419053
17.513297 731784319 16.567252 709416066
17.473170 721502357 16.543261 709815515

which looks about the same. So I'll take it out.

1102

I hear you ask - why not just keep this a simple while loop? Well, because it can't a simple while loop in the threaded tree, and I seek to make the final patch that switches from not-threaded to threaded to be one that can be easily understood, if I ever get there.

Go ahead and make findnext and findprev look less weird.

This revision is now accepted and ready to land.Nov 27 2019, 6:16 PM

Can you explain why this change reduces the number of L1 data cache misses?

In D22544#493747, @alc wrote:

Can you explain why this change reduces the number of L1 data cache misses?

I haven't observed that. I didn't expect to impact L1 data cache misses.

In D22544#493747, @alc wrote:

Can you explain why this change reduces the number of L1 data cache misses?

I haven't observed that. I didn't expect to impact L1 data cache misses.

I misread the results.