Speed up vm_page_array initialization.
Closed, Public

Authored by markj on Sep 6 2017, 6:29 PM.

Details

Reviewers

alc
kib

Commits

rS323290: Speed up vm_page_array initialization.

Summary

Collapse three loops over the vm_page array into one, following a
suggestion from alc@. Testing on EC2 and on a few desktop CPUs (Intel
and AMD) shows a significant improvement in initialization time.
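
To illustrate the general idea of collapsing several passes over an array into one, here is a minimal userland sketch. It is not the code from this revision: struct page_desc, its fields, the constants, and the work done in each pass are hypothetical stand-ins chosen only to show the shape of the transformation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte per-page record; NOT the real struct vm_page. */
struct page_desc {
	uint64_t phys_addr;
	uint32_t order;
	uint32_t flags;
	uint8_t  pad[48];
};

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */
#define SKETCH_ORDER_NONE 0xffu	/* assumed "not in any free list" marker */

/* Before: three separate passes, each walking the whole array. */
void
init_three_passes(struct page_desc *arr, size_t n, uint64_t first_pa)
{
	size_t i;

	assert(arr != NULL || n == 0);
	for (i = 0; i < n; i++)
		memset(&arr[i], 0, sizeof(arr[i]));
	for (i = 0; i < n; i++)
		arr[i].phys_addr = first_pa + (uint64_t)i * SKETCH_PAGE_SIZE;
	for (i = 0; i < n; i++)
		arr[i].order = SKETCH_ORDER_NONE;
}

/* After: one pass that does all per-element work while the element is hot. */
void
init_one_pass(struct page_desc *arr, size_t n, uint64_t first_pa)
{
	size_t i;

	assert(arr != NULL || n == 0);
	for (i = 0; i < n; i++) {
		memset(&arr[i], 0, sizeof(arr[i]));
		arr[i].phys_addr = first_pa + (uint64_t)i * SKETCH_PAGE_SIZE;
		arr[i].order = SKETCH_ORDER_NONE;
	}
}
```

Both functions leave the array in the same state. The single-pass version touches each record once, so a large array is streamed through the cache one time instead of three.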

One exception is pig1 in the netperf cluster, which has Westmere-EX
CPUs: there, this patch makes vm_page_array initialization about 10%
slower. I haven't yet figured out how to address that. Some
experimentation showed that zeroing vm_pages in strides smaller than
the L1 cache size gives a good improvement, but the result depends on
the alignment of the strides: my initial patch didn't align them, and
when I aligned them to the cache line size, initialization slowed down
again.
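
One possible reading of the stride experiment is sketched below: zero a block of records that fits within a chosen stride, then initialize those same records before moving on. This is an assumption about the shape of the experiment, not the patch itself. struct page_rec, SKETCH_PG_SIZE, and init_in_strides() are hypothetical names, and the sketch does not model stride alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 64-byte per-page record; NOT the real struct vm_page. */
struct page_rec {
	uint64_t phys_addr;
	uint8_t  pad[56];
};

#define SKETCH_PG_SIZE 4096u	/* assumed page size for the sketch */

/*
 * Zero and initialize the array in blocks of at most stride_bytes, so
 * that each block is zeroed and then filled in before the next block
 * is touched.
 */
void
init_in_strides(struct page_rec *arr, size_t n, uint64_t first_pa,
    size_t stride_bytes)
{
	size_t per_block, start, end, i;

	assert(stride_bytes >= sizeof(arr[0]));
	per_block = stride_bytes / sizeof(arr[0]);
	for (start = 0; start < n; start = end) {
		/* Clamp the last block to the end of the array. */
		end = (n - start < per_block) ? n : start + per_block;
		memset(&arr[start], 0, (end - start) * sizeof(arr[0]));
		for (i = start; i < end; i++)
			arr[i].phys_addr =
			    first_pa + (uint64_t)i * SKETCH_PG_SIZE;
	}
}
```

In a timing experiment, stride_bytes would be the tunable, for example a value below the L1 data cache size of the machine under test.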

I managed to reproduce what I think is the same problem using a userland
program, but the issue is difficult to analyze since many of the PMCs
I've tried to use don't appear to work (pmcstat returns EINVAL upon
attempting to enable them). I haven't yet dug into this further.

However, since the patch is straightforward and gives a good
improvement for cperciva's case, I'd like to propose its inclusion now.

Diff Detail

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 11417
Build 11776: arc lint + arc unit

Event Timeline

markj created this revision. Sep 6 2017, 6:29 PM

Harbormaster completed remote builds in B11399: Diff 32727. Sep 6 2017, 6:29 PM

markj added reviewers: alc, kib. Sep 6 2017, 6:33 PM

markj added subscribers: mjg, cperciva.

alc added inline comments. Sep 7 2017, 3:51 PM

sys/vm/vm_page.c
Line 625: For stylistic consistency, we should probably have a blank line here, i.e., before the #if/block comment.
Line 668: I would suggest adding a comment explaining our assumptions about phys_avail[]. Essentially, what you wrote in an earlier email.