amd64: align memmove buffers to 16 bytes before using rep movs
Closed, Public

Authored by mjg on Sat, Dec 1, 1:10 PM.

Details

Summary

Sample result from EPYC:

(256-511)... memmove_kernel_align16: 1.232132844 seconds.
(512-1023)... memmove_kernel_align16: 4.041251906 seconds.
(1024-2047)... memmove_kernel_align16: 11.302151396 seconds.
(2048-4095)... memmove_kernel_align16: 17.781004266 seconds.
(256-511)... memmove_kernel: 1.224122136 seconds.
(512-1023)... memmove_kernel: 4.494771289 seconds.
(1024-2047)... memmove_kernel: 13.450829032 seconds.
(2048-4095)... memmove_kernel: 23.493978908 seconds.
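
To make the comparison above easier to read, the following sketch computes how much time the aligned variant saves relative to the baseline in each size bucket. The timings are copied from the sample output; the struct, the function names, and the idea of reporting a percentage are illustrative assumptions, not part of the benchmark itself. With these numbers the saving is roughly 24% for 2048-4095, 16% for 1024-2047, 10% for 512-1023, and slightly negative (under 1% slower) for 256-511.

```c
#include <assert.h>
#include <stdio.h>

/* One row of the sample output: size bucket plus both timings in seconds. */
struct bucket {
    const char *range;
    double baseline;   /* memmove_kernel */
    double aligned;    /* memmove_kernel_align16 */
};

/* Timings copied verbatim from the EPYC sample result. */
static const struct bucket buckets[] = {
    { "256-511",    1.224122136,  1.232132844 },
    { "512-1023",   4.494771289,  4.041251906 },
    { "1024-2047", 13.450829032, 11.302151396 },
    { "2048-4095", 23.493978908, 17.781004266 },
};

/* Percentage of baseline time saved; negative means the aligned variant is slower. */
static double
percent_saved(double baseline, double aligned)
{
    assert(baseline > 0.0);
    return ((baseline - aligned) / baseline * 100.0);
}

/* Hypothetical helper: print one line per bucket. */
static void
print_savings(void)
{
    size_t i;

    for (i = 0; i < sizeof(buckets) / sizeof(buckets[0]); i++)
        printf("(%s) saved: %.1f%%\n", buckets[i].range,
            percent_saved(buckets[i].baseline, buckets[i].aligned));
}
```

Calling `print_savings()` from a `main` function prints one line per bucket.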

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mjg created this revision. Sat, Dec 1, 1:10 PM
kib accepted this revision. Sat, Dec 1, 1:35 PM

This aligns the destination but not the source. I would expect aligning the other side of the movsb to provide a comparable improvement?
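
As a rough illustration of what aligning the destination means here, the C sketch below copies a short unaligned head so that the destination pointer reaches a 16-byte boundary, then copies the rest in bulk. It is not the code from support.S: the helper names are hypothetical, `memcpy` stands in for both the head copy and the `rep movs` bulk copy, and only the forward, non-overlapping case is handled. The source pointer is advanced by the same amount and may remain misaligned, which is the asymmetry the comment above asks about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed to advance p to the next 16-byte boundary (0 if already aligned). */
static size_t
head_to_align16(const void *p)
{
    return ((size_t)(-(uintptr_t)p & 15));
}

/*
 * Hypothetical helper: forward copy of non-overlapping buffers that
 * aligns the destination to 16 bytes before the bulk copy.
 */
static void *
copy_align16_dst(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t head = head_to_align16(d);

    /* Never copy more than the caller asked for. */
    if (head > len)
        head = len;

    /* Unaligned head: at most 15 bytes. */
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

    /* If anything is left, the destination is now 16-byte aligned. */
    assert(len == 0 || ((uintptr_t)d & 15) == 0);

    /* Bulk copy; this is where the assembly version would use rep movs. */
    memcpy(d, s, len);
    return (dst);
}
```

For example, a destination address ending in 0x1 needs a 15-byte head, while one ending in 0x0 needs none.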

sys/amd64/amd64/support.S
305 ↗(On Diff #51468)

you can do testb

327 ↗(On Diff #51468)

Is that a tab between the instruction and the register?

This revision is now accepted and ready to land. Sat, Dec 1, 1:35 PM
This revision was automatically updated to reflect the committed changes.
mjg marked an inline comment as done. Sat, Dec 1, 2:22 PM

I ran basic tests varying the alignment of src, and the slowdowns were very small compared to a similarly misaligned dst, at least on EPYC. I may take a closer look later.

sys/amd64/amd64/support.S
327 ↗(On Diff #51468)

yes