Page MenuHomeFreeBSD

lib/libmd: reimplement and enhance md5
Needs ReviewPublic

Authored by fuz on Jun 21 2024, 10:27 AM.
Tags
None
Referenced Files
F100686920: D45670.diff
Fri, Oct 18, 9:16 AM
F100665291: D45670.id140065.diff
Fri, Oct 18, 2:09 AM
F100658031: D45670.id.diff
Thu, Oct 17, 11:29 PM
Unknown Object (File)
Wed, Oct 16, 6:47 AM
Unknown Object (File)
Tue, Oct 15, 6:01 PM
Unknown Object (File)
Tue, Oct 15, 6:01 PM
Unknown Object (File)
Mon, Oct 14, 12:10 PM
Unknown Object (File)
Sat, Oct 12, 7:18 PM
Subscribers

Details

Summary

The reimplementation is a bit cleaner than the original code,
although it is also slightly slower. This shouldn't matter too
much as we have asm code for the major platforms.

Optimised implementations are provided for amd64 and aarch64.
For amd64, we have three implementations. One for baseline,
one using ANDN from BMI1 and one using AVX-512 (though it's not
really vectorised). Here's the performance:

11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz:

pre	17.0s (602 MB/s)
generic	18.8s (545 MB/s)
scalar	13.4s (764 MB/s)
bmi1	12.0s (853 MB/s)
avx512	10.6s (966 MB/s)

ARM Cortex-X1C (Windows 2023 Dev Kit perf core):

pre	35.2s (291 MB/s)
generic	36.4s (281 MB/s)
scalar	34.5s (297 MB/s)

ARM Cortex-A78C (Windows 2023 Dev Kit efficiency core):

pre	46.8s (219 MB/s)
generic	47.3s (216 MB/s)
scalar	44.5s (230 MB/s)

This changeset will have to be reworked when D34497 lands.
I'm not sure how to apply the SIMD code to all uses of MD5.

This changeset anticipates D34498 and no longer provides the
transform and block symbols.

Obtained from: https://github.com/animetosho/md5-optimisation/

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 59973
Build 56858: arc lint + arc unit

Event Timeline

fuz requested review of this revision.Jun 21 2024, 10:27 AM
lwhsu added inline comments.
lib/libmd/aarch64/md5block.S
4 ↗(On Diff #140065)

The order is SPDX then owner from style(9).

fuz planned changes to this revision.Tue, Oct 1, 3:29 PM

Needs to be refactored after the libmd rework.

  • lib/libmd: add include makefile for md5
  • lib/libmd: reimplement md5
  • lib/libmd: add scalar amd64 asm implementation for md5
  • lib/libmd/amd64: add BMI1 implementation for amd64
  • lib/libmd/amd64/md5block.S: add AVX-512 implementation
  • lib/libmd/aarch64: add md5 scalar implementation

This does things the proper way: a new include Makefile is added
so modules that need a copy of the md5 code (libc, libsa) can grab
it easily. The new md5 code is also used in the kernel (except for
the amd64 avx-512 path, as it's too expensive to turn on the FPU).
Internal symbols are decorated with _libmd_ to avoid collisions.

Can you move md5c.c in a separate change? It makes it easier to review only the changes to the file (if any)

sys/crypto/md5/md5block_aarch64.S
12

Can you move the macros before ENTRY? It makes it easier to find where the start of the function is if you don't have to skip over them.

176–177

Why are these commented out?

Can you move md5c.c in a separate change? It makes it easier to review only the changes to the file (if any)

Hi Andy,

The file md5c.c has been rewritten entirely. A diff can be produced, but is pretty much useless. What stays the same is the weak aliases on the bottom. You can find a the desired diff on Github.

sys/crypto/md5/md5block_aarch64.S
12

I have placed the macros after ENTRY as they “declare” local variables of this function. I can move the ENTRY macro around, but I believe it is cleared to have the declarations of local variables after the ENTRY macro, by analogy to how C functions are laid out.

176–177

They have been outlined and specialised in the two preceding blocks. I placed the commented out code there to make the correspondence clear; I should perhaps add a comment to make this clearer.

  • sys/crypto/md5/md5c.c: zero out context in MD5Final()
  • sys/crypto/md5/md5block_aarch64.S: improve comments, move ENTRY()
  • lib/libmd/Makefile.md5.inc: only apply -DMD5_ASM to md5c.c