
amd64: align memmove buffers to 16 bytes before using rep movs

Authored by mjg on Dec 1 2018, 1:10 PM.



Sample result from EPYC:

(256-511)... memmove_kernel_align16: 1.232132844 seconds.
(512-1023)... memmove_kernel_align16: 4.041251906 seconds.
(1024-2047)... memmove_kernel_align16: 11.302151396 seconds.
(2048-4095)... memmove_kernel_align16: 17.781004266 seconds.
(256-511)... memmove_kernel: 1.224122136 seconds.
(512-1023)... memmove_kernel: 4.494771289 seconds.
(1024-2047)... memmove_kernel: 13.450829032 seconds.
(2048-4095)... memmove_kernel: 23.493978908 seconds.
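
For readers unfamiliar with the technique named in the title, here is a minimal C sketch of the idea: copy a short unaligned head so that the destination reaches a 16-byte boundary, then hand the remainder to the bulk copy. This is not the committed assembly. The names `align16_head` and `copy_align16` are hypothetical, `memcpy` stands in for `rep movs`, and the sketch handles only non-overlapping forward copies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: number of bytes needed to advance p to the next
 * 16-byte boundary (0 if p is already aligned).
 */
static size_t
align16_head(const void *p)
{
	return ((size_t)(-(uintptr_t)p & 15));
}

/*
 * Hypothetical forward copy for non-overlapping buffers. Only the
 * destination is aligned; the source may remain misaligned.
 */
static void *
copy_align16(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t head = align16_head(d);

	if (head > len)
		head = len;

	/* Short unaligned head: at most 15 bytes. */
	memcpy(d, s, head);

	/* Unless the head consumed everything, d + head is now aligned. */
	assert(head == len || ((uintptr_t)(d + head) & 15) == 0);

	/* Bulk copy; memcpy is a stand-in for rep movs in this sketch. */
	memcpy(d + head, s + head, len - head);
	return (dst);
}
```

The source pointer advances by the same amount as the destination, so its alignment after the head copy depends on how the two buffers were offset relative to each other.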

Diff Detail

rS FreeBSD src repository
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mjg created this revision. Dec 1 2018, 1:10 PM
kib accepted this revision. Dec 1 2018, 1:35 PM

This aligns the destination but not the source. I would expect that aligning the other side of movsb would provide a comparable improvement?

305 ↗(On Diff #51468)

You can use testb here.
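
The diff line this comment refers to is not shown here. Assuming it is the 16-byte alignment check, the point would be that the low four bits of an address live entirely in its low byte, so a byte-sized test gives the same answer as a full-width one (and on amd64 has a shorter encoding). A small C sketch of that equivalence, with hypothetical function names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Check against the full-width address (analogous to a quad-word test). */
static bool
misaligned16_full(uintptr_t addr)
{
	return ((addr & 15) != 0);
}

/* Check against the low byte only (analogous to a byte-sized test). */
static bool
misaligned16_byte(uintptr_t addr)
{
	uint8_t low = (uint8_t)addr;	/* keep only bits 0-7 */

	return ((low & 15) != 0);
}
```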

327 ↗(On Diff #51468)

Is that a tab between the instruction and the register?

This revision is now accepted and ready to land. Dec 1 2018, 1:35 PM
This revision was automatically updated to reflect the committed changes.
mjg marked an inline comment as done. Dec 1 2018, 2:22 PM

I did basic tests changing the alignment of src, and the slowdowns were very small compared to a similarly misaligned dst, at least on EPYC. I may take a closer look later.

327 ↗(On Diff #51468)