
amd64: align memmove buffers to 16 bytes before using rep movs
Closed, Public

Authored by mjg on Dec 1 2018, 1:10 PM.
Tags
None

Details

Summary

Sample result from EPYC:

(256-511)... memmove_kernel_align16: 1.232132844 seconds.
(512-1023)... memmove_kernel_align16: 4.041251906 seconds.
(1024-2047)... memmove_kernel_align16: 11.302151396 seconds.
(2048-4095)... memmove_kernel_align16: 17.781004266 seconds.
(256-511)... memmove_kernel: 1.224122136 seconds.
(512-1023)... memmove_kernel: 4.494771289 seconds.
(1024-2047)... memmove_kernel: 13.450829032 seconds.
(2048-4095)... memmove_kernel: 23.493978908 seconds.
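
To make the comparison easier to read, the timings above can be reduced to a ratio per bucket. The following is a minimal C sketch, not part of the revision: the struct and function names are hypothetical, and the bucket labels are taken verbatim from the output above (their unit is not stated there). It divides the memmove_kernel time by the memmove_kernel_align16 time, so a value above 1 means the aligned variant was faster.

```c
#include <stddef.h>
#include <stdio.h>

/* One row of the sample output; times are in seconds as reported. */
struct bucket {
	const char *range;	/* bucket label, copied from the output */
	double align16_s;	/* memmove_kernel_align16 */
	double baseline_s;	/* memmove_kernel */
};

static const struct bucket buckets[] = {
	{ "256-511",    1.232132844,  1.224122136 },
	{ "512-1023",   4.041251906,  4.494771289 },
	{ "1024-2047", 11.302151396, 13.450829032 },
	{ "2048-4095", 17.781004266, 23.493978908 },
};

#define	NBUCKETS	(sizeof(buckets) / sizeof(buckets[0]))

/* Ratio > 1 means the align16 variant took less time than the baseline. */
static double
baseline_over_align16(const struct bucket *b)
{
	return (b->baseline_s / b->align16_s);
}

/* Print one "range ratio" line per bucket. */
static void
print_ratios(void)
{
	size_t i;

	for (i = 0; i < NBUCKETS; i++)
		printf("%-10s %.3f\n", buckets[i].range,
		    baseline_over_align16(&buckets[i]));
}
```

On these numbers the baseline takes roughly 1.11x, 1.19x and 1.32x as long as the aligned variant in the three larger buckets, while the smallest bucket is essentially unchanged (the aligned variant is under 1% slower there).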

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 21295

Event Timeline

This aligns the destination but not the source. I would expect aligning the other side of movsb to provide a comparable improvement?
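
For readers without the diff at hand, the idea of aligning only the destination can be sketched in C. This is an illustration under stated assumptions, not the committed code: the real change is amd64 assembly in sys/amd64/amd64/support.S, the function names below are hypothetical, the sketch handles only a forward copy of non-overlapping buffers, it ignores any size threshold the patch may apply, and memcpy stands in for the bulk rep movs step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: bytes needed to advance p to a 16-byte boundary. */
static size_t
bytes_to_align16(const void *p)
{
	return ((size_t)(-(uintptr_t)p & 15));
}

/*
 * Hypothetical forward-only copy; dst and src must not overlap.
 * Copies a short head byte by byte until dst is 16-byte aligned,
 * then hands the remainder to the bulk copy. Only dst is aligned;
 * src keeps whatever alignment it ends up with after the head.
 */
static void *
copy_fwd_align16(void *dst, const void *src, size_t len)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t head;

	head = bytes_to_align16(d);
	if (head > len)
		head = len;
	len -= head;
	while (head-- > 0)
		*d++ = *s++;

	/* If anything is left, the destination is now 16-byte aligned. */
	assert(len == 0 || ((uintptr_t)d & 15) == 0);

	/* Stand-in for the bulk copy (rep movs in the assembly). */
	memcpy(d, s, len);
	return (dst);
}
```

Because the head advances both pointers by the same amount, source and destination can only both end up aligned when they start with the same offset modulo 16, which is why the source-alignment question is a separate one.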

sys/amd64/amd64/support.S
305

You can use testb here.

327

Is there a tab between the instruction and the register?

This revision is now accepted and ready to land. (Dec 1 2018, 1:35 PM)
This revision was automatically updated to reflect the committed changes.
mjg marked an inline comment as done. (Dec 1 2018, 2:22 PM)

I ran basic tests varying the alignment of src, and the slowdowns were very small compared to a similarly misaligned dst, at least on EPYC. I may take a closer look later.

sys/amd64/amd64/support.S
327

yes