Page MenuHomeFreeBSD

lib/libc/amd64/string/strcat.S: enable use of SIMD
ClosedPublic

Authored by fuz on Nov 14 2023, 7:00 PM.
Tags
None
Referenced Files
Unknown Object (File)
Apr 12 2024, 1:51 AM
Unknown Object (File)
Mar 31 2024, 8:26 AM
Unknown Object (File)
Mar 31 2024, 7:26 AM
Unknown Object (File)
Mar 18 2024, 9:28 PM
Unknown Object (File)
Mar 9 2024, 7:26 PM
Unknown Object (File)
Feb 20 2024, 5:42 PM
Unknown Object (File)
Feb 11 2024, 5:13 AM
Unknown Object (File)
Feb 3 2024, 1:12 AM
Subscribers

Details

Summary

strcat has a bespoke scalar assembly implementation we
inherited from NetBSD. While it performs well, it is
better to call into our SIMD implementations if any SIMD
features are available at all. So do that and implement
strcat() by calling into strlen() and strcpy() if these
are available.

Sponsored by: The FreeBSD Foundation

Test Plan

passes the strcat unit test. No new kyua
test suite failures.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

fuz requested review of this revision.Nov 14 2023, 7:00 PM
lib/libc/amd64/string/strcat.S
191 ↗(On Diff #130091)

This needs a comment explaining that the stack is properly aligned after the pushes.
That said, why not to keep args around in more straightforward way, by saving two callee-saved registers and keeping the args in them, instead of reading values from stack?

193 ↗(On Diff #130091)

CNAME() is not needed for ELF, and is not used in new code for long time

lib/libc/amd64/string/strcat.S
191 ↗(On Diff #130091)

This needs a comment explaining that the stack is properly aligned after the pushes.

Do I really need to point out the obvious? I can also add comments indicating which registers hold which arguments or just copy the amd64 SysV ABI supplement in its entirety into the code if that helps.

It's honestly a bit frustrating to randomly be asked to point out random bits of the calling convention and then randomly be told to remove such comments as what happens is obvious from the calling convention. This happened a bunch of times now and there seems to be no consistency in the requests.

Could you give me guidance as to what exactly I should comment for every assembly function and what I should not comment about? I would like to avoid having to repeat this discussion, so perhaps you could just tell me once and I'll do it in accordance to your wishes in the future.

That said, why not to keep args around in more straightforward way, by saving two callee-saved registers and keeping the args in them, instead of reading values from stack?

I do so for rbx where it makes sense, but for rsi it would actually make the code longer with no change in the number of loads and stores, except perhaps to remove the load latency from rsi and move it to the reload of the callee-saved register.

193 ↗(On Diff #130091)

I'm hearing this for the first time, having previously used CNAME multiple times throughout this project. Including in the archlevel dispatch framework, which you thoroughly reviewed. Why didn't you tell me earlier? It is very disheartening to hear that this macro I have used frequently and without questions from any reviewers is supposed to be obsolete even though there is no comment anywhere that it is and nobody ever said so; on the contrary, <machine/asm.h> reads like it is strongly suggested to use this macro.

If CNAME is no longer supposed to be used, you should add a deprecation notice into <machine/asm.h> and remove it from all code in FreeBSD. Note that other people added uses of CNAME or at least didn't remove them when they had the choice to in recent commits like f62da49b2f17,

However, I would prefer to leave the CNAME in for consistency and to make it easier to adapt the code to other platforms that may have symbol decoration (such as macOS). But if you wish so, I can of course remove it entirely.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 25 2023, 2:26 PM
This revision was automatically updated to reflect the committed changes.