Details

Reviewers

kib
markj

Commits

rS355147: Inline some splay helper functions to improve performance on a

Summary

Replace a few utility functions for vm_map manipulation with macros, and add a boolean status argument to give the compiler a chance to generate better code.

Test Plan

On two tests that exercise the vm_map vigorously, these are the runtimes and dc-misses with and without the change in place:

After:
Test 1                  Test 2
seconds    dc-misses    seconds    dc-misses
17.518129  719615104    16.613336  694618946
17.544017  717358163    16.560833  695521981
17.498316  722645549    16.519886  693421798
17.481360  723478804    16.559717  689488954
17.505018  720791939    16.494966  689094378

Before:
Test 1                  Test 2
seconds    dc-misses    seconds    dc-misses
18.448940  697424367    17.487485  680852142
18.520613  700154192    17.488678  670644687
18.369913  696744015    17.553468  668572905
18.412856  704237817    17.457770  664390472
18.513684  700244077    17.480940  664748569
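To compare the two sets of runs, it can help to reduce each column to a mean and a relative change. The following is a minimal sketch, not part of the revision: the helper names `mean` and `pct_change` and the array names are made up for illustration, and the numbers are copied from the Before and After runs above. By this arithmetic the run times drop by roughly 5% on both tests, while the dc-misses counts rise by roughly 3%.

```c
#include <assert.h>
#include <stddef.h>

#define NRUNS 5

/* Arithmetic mean of n samples. */
static double
mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return (sum / (double)n);
}

/* Relative change from 'before' to 'after', in percent. */
static double
pct_change(double before, double after)
{
	return (100.0 * (after - before) / before);
}

/* Test 1, seconds and dc-misses, copied from the Test Plan. */
static const double t1_sec_before[NRUNS] = {
	18.448940, 18.520613, 18.369913, 18.412856, 18.513684
};
static const double t1_sec_after[NRUNS] = {
	17.518129, 17.544017, 17.498316, 17.481360, 17.505018
};
static const double t1_dc_before[NRUNS] = {
	697424367.0, 700154192.0, 696744015.0, 704237817.0, 700244077.0
};
static const double t1_dc_after[NRUNS] = {
	719615104.0, 717358163.0, 722645549.0, 723478804.0, 720791939.0
};

/* Test 2, seconds and dc-misses, copied from the Test Plan. */
static const double t2_sec_before[NRUNS] = {
	17.487485, 17.488678, 17.553468, 17.457770, 17.480940
};
static const double t2_sec_after[NRUNS] = {
	16.613336, 16.560833, 16.519886, 16.559717, 16.494966
};
static const double t2_dc_before[NRUNS] = {
	680852142.0, 670644687.0, 668572905.0, 664390472.0, 664748569.0
};
static const double t2_dc_after[NRUNS] = {
	694618946.0, 695521981.0, 693421798.0, 689488954.0, 689094378.0
};
```

For example, `pct_change(mean(t1_sec_before, NRUNS), mean(t1_sec_after, NRUNS))` gives the Test 1 change in run time as a negative percentage.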

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

dougm created this revision. Nov 25 2019, 8:54 AM

dougm added a subscriber: pho. Nov 25 2019, 3:54 PM

Restore gap_end variable to fix findspace problem.

I ran tests on D22544.64854 for 14 hours. No problems seen.

Do you get similar results by using C functions with the __always_inline attribute instead of macros?

Replace the macros with __always_inline functions, which preserves, and perhaps slightly enhances, the performance benefit.
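For readers unfamiliar with the trade-off being discussed, here is a toy sketch of the same helper written once as a macro and once as a forced-inline function. It is not the vm_map.c code: `struct toy_node`, `TOY_STEP_MACRO`, `toy_step` and `TOY_ALWAYS_INLINE` are hypothetical names, and the attribute is spelled out by hand on the assumption of a GCC- or Clang-compatible compiler rather than taken from the kernel's `__always_inline` definition.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC/Clang attribute syntax; plain 'static inline' otherwise. */
#if defined(__GNUC__) || defined(__clang__)
#define TOY_ALWAYS_INLINE static inline __attribute__((__always_inline__))
#else
#define TOY_ALWAYS_INLINE static inline
#endif

/* Toy binary-tree node covering the half-open range [start, end). */
struct toy_node {
	struct toy_node *left, *right;
	unsigned long start, end;
};

/* Macro form: expands in place and assigns to the caller's variable. */
#define TOY_STEP_MACRO(node, addr) do {					\
	(node) = (addr) < (node)->start ? (node)->left : (node)->right;	\
} while (0)

/* Function form: same step, but type-checked and with normal scoping. */
TOY_ALWAYS_INLINE struct toy_node *
toy_step(struct toy_node *node, unsigned long addr)
{
	assert(node != NULL);
	return (addr < node->start ? node->left : node->right);
}

/* Descend until addr falls inside a node or the subtree is exhausted. */
static struct toy_node *
toy_lookup_macro(struct toy_node *root, unsigned long addr)
{
	while (root != NULL && (addr < root->start || addr >= root->end))
		TOY_STEP_MACRO(root, addr);
	return (root);
}

static struct toy_node *
toy_lookup_inline(struct toy_node *root, unsigned long addr)
{
	while (root != NULL && (addr < root->start || addr >= root->end))
		root = toy_step(root, addr);
	return (root);
}
```

Both lookups should compile to similar code once the function is inlined. The function form keeps argument type checking and avoids evaluating its arguments more than once.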

markj added inline comments. Nov 27 2019, 4:26 PM

sys/vm/vm_map.c
1096 ↗	(On Diff #64894)	Isn't `*found` true iff `root != NULL`? Why do we need `found` at all?

Stop trying to insert 'found' into the splay search code.
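The point about `found` can be illustrated with a toy search. This is a simplified sketch under assumptions, not the splay code from vm_map.c: `struct demo_entry`, `demo_search_found` and `demo_search` are invented names, and the tree is a plain binary search tree. When the search returns the matching node or NULL, a separate status flag carries no extra information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy tree entry covering the half-open range [start, end). */
struct demo_entry {
	struct demo_entry *left, *right;
	unsigned long start, end;
};

/* Variant with an extra status argument, as in the earlier diff. */
static struct demo_entry *
demo_search_found(struct demo_entry *root, unsigned long addr, bool *found)
{
	assert(found != NULL);
	*found = false;
	while (root != NULL) {
		if (addr < root->start)
			root = root->left;
		else if (addr >= root->end)
			root = root->right;
		else {
			*found = true;
			break;
		}
	}
	return (root);
}

/* Variant without the flag: the caller tests the result against NULL. */
static struct demo_entry *
demo_search(struct demo_entry *root, unsigned long addr)
{
	while (root != NULL && (addr < root->start || addr >= root->end))
		root = addr < root->start ? root->left : root->right;
	return (root);
}
```

In this sketch `*found` is true exactly when the returned pointer is non-NULL, so a caller of `demo_search` loses nothing by dropping the flag.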

dougm added inline comments. Nov 27 2019, 5:45 PM

sys/vm/vm_map.c
1102 ↗	(On Diff #64952)	I hear you ask - why not just keep this a simple while loop? Well, because it can't be a simple while loop in the threaded tree, and I seek to make the final patch that switches from not-threaded to threaded to be one that can be easily understood, if I ever get there.
1096 ↗	(On Diff #64894)	I thought I was saving a root != NULL test and letting the break cases in LEFT_STEP/RIGHT_STEP go directly to handling the root != NULL cases without testing. But I guess I was wrong. After stripping out 'found', I get:
17.482829  719261050    16.607015  704332901
17.536821  744567159    16.530198  691722248
17.491278  745792388    16.500271  692419053
17.513297  731784319    16.567252  709416066
17.473170  721502357    16.543261  709815515
which looks about the same. So I'll take it out.

Go ahead and make findnext and findprev look less weird.

markj accepted this revision. Nov 27 2019, 6:16 PM

This revision is now accepted and ready to land. Nov 27 2019, 6:16 PM

dougm added a subscriber: alc. Nov 27 2019, 7:12 PM

Can you explain why this change reduces the number of L1 data cache misses?

In D22544#493747, @alc wrote:

Can you explain why this change reduces the number of L1 data cache misses?

I haven't observed that. I didn't expect to impact L1 data cache misses.

In D22544#493871, @dougm wrote:

In D22544#493747, @alc wrote:

Can you explain why this change reduces the number of L1 data cache misses?

I haven't observed that. I didn't expect to impact L1 data cache misses.

I misread the results.

Closed by commit rS355147: Inline some splay helper functions to improve performance on a (authored by dougm). Nov 27 2019, 9:00 PM

This revision was automatically updated to reflect the committed changes.

dougm added a commit: rS355147: Inline some splay helper functions to improve performance on a.

Herald added a subscriber: imp. Nov 27 2019, 9:00 PM

Use macros to search vm_map
ClosedPublic

Diff 64979

head/sys/vm/vm_map.c
