Page MenuHomeFreeBSD

Import Blake2 algorithms (blake2b, blake2s) from libb2
ClosedPublic

Authored by cem on Mar 12 2018, 2:42 AM.
Tags
None
Referenced Files
F108160030: D14662.diff
Wed, Jan 22, 12:58 AM
Unknown Object (File)
Fri, Jan 17, 6:36 PM
Unknown Object (File)
Thu, Jan 16, 7:18 PM
Unknown Object (File)
Thu, Jan 2, 12:36 AM
Unknown Object (File)
Wed, Jan 1, 11:33 PM
Unknown Object (File)
Dec 13 2024, 12:04 AM
Unknown Object (File)
Nov 17 2024, 12:15 AM
Unknown Object (File)
Nov 9 2024, 11:08 AM
Subscribers

Details

Summary

The upstream repository is on github BLAKE2/libb2. Files landed in
sys/contrib/libb2 are the unmodified upstream files.

sys/crypto/blake2 contains the source files needed to port libb2 to our
build system, a wrapped (limited) variant of the algorithm to match the API
of our auth_transform softcrypto abstraction, incorporation into the Open
Crypto Framework (OCF) cryptosoft(4) driver, as well as an x86 SSE/AVX
accelerated OCF driver, blake2(4).

Optimized variants of blake2 are compiled for a number of x86 machines
(anything from SSE2 to AVX + XOP). On those machines, FPU context will need
to be explicitly saved before using blake2(4)-provided algorithms directly.
Use via cryptodev / OCF saves FPU state automatically, and use via the
auth_transform softcrypto abstraction does not use FPU.

The intent of the OCF driver is mostly to enable testing in userspace via
/dev/crypto. ATF tests are added with published KAT test vectors to
validate correctness.

Test Plan

ATF tests included pass.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 15590
Build 15626: arc lint + arc unit

Event Timeline

conf/files: Also build in blake2 for ipsec | ipsec_support (forced by
cryptosoft.c dependency).

@Reviewers:

I'd suggest skipping over sys/contrib/libb2 entirely; it's uninteresting and just a direct import of files from libb2 from Github. You can verify 1:1 if you want. That's about 3/4 of the way down this page.

sys/crypto/blake2 has more interesting parts. Notable are blake2-sw.c -- this is the auth_hash generic transform wrapper around the non-accelerated C implementation -- and blake2_cryptodev.c -- this is the SSE/AVX accelerated implementation for x86 (implemented as a blake2 kernel module for now).

Lots more uninteresting files in this directory (still sys/crypto/blake2) -- standard header-alikes and blake2_kfreebsd.h are just enough userspace to get libb2 to build unmodified in the kernel. The various blake2<x>-<hw>.c files are also uninteresting -- just a way to rebuild blake2<x> several times with different -mfoo hardware flags enabled. config.h in this directory contains the only two defines this portion of libb2 actually cared about from configure.ac.

sys/modules and conf/files: Integrate with the build system.

sys/opencrypto: Minor changes needed in several places to support blake2 in cryptodev and cryptosoft.

Finally: tests/sys/opencrypto: tests.

Just displaying my ignorance: what is the use case for Blake2 in the kernel?

In D14662#308255, @jhb wrote:

Just displaying my ignorance: what is the use case for Blake2 in the kernel?

The motivation is to remove obstacles from a kernel implementation of Wireguard.

blake2 would also make a nice hash for ZFS

Fix build on powerpc.

It was the weird attempt to construct explicit_bzero that was causing linking
issues for ppc. For now, patch the otherwise pristine upstream libb2 sources
in this one location to fix ppc build. It is anticipated that this diff will
be dropped in the future as upstream moves to explicit_bzero preferentially:
https://github.com/BLAKE2/libb2/pull/14

sys/crypto/blake2/blake2_cryptodev.c
215

Isn't this check redundant and racy? We check sc->dying below with the sc lock held.

246

"Freed sessions are moved to the head of the queue, so ..."?

264

authini can't be NULL here.

291

(I'm not familiar with cryptodev.) How do we ensure that a different thread isn't holding a reference to this session in blake2_cipher_process()?

302

This looks a bit odd - shouldn't the cast be applied after the mask? I don't think it makes a difference in this case.

321

Style: mixed definitions and initializations.

385

Could be DEVMETHOD_END

sys/opencrypto/cryptosoft.c
967

Style: parens around return value.

tests/sys/opencrypto/blake2_test.c
54

This file is testing the running kernel, isn't it? If so, don't we in principle want to compile against the system headers rather than whatever happens to be in the src tree?

58

This should be available by the include path; can't we include <blake2.h>?

Thanks for taking a look!

sys/crypto/blake2/blake2_cryptodev.c
215

Yeah, it looks like it. This code comes almost directly from aesni(4). I'm not sure it's any better there.

246

Sure. Again, copied from aesni(4) :-)

264

Will fix.

291

Sessions don't have reference counts -- different threads are expected to open independent sessions.

302

If the compiler is smart enough to recognize the mask as a 32-bit value (and I expect it is) there should be no difference in the optimized code. Otherwise, yeah, it doesn't matter where the cast goes. This construct was just copied from aesni like everything else.

I think the plan going forward is to scrap the sid system and just use direct non-truncated pointers. JHB sent out a pretty thorough email a while back.

321

Yeah. Again, comes from aesni(4). Can fix.

385

Sure.

sys/opencrypto/cryptosoft.c
967

Prevailing style of the file is no parens.

tests/sys/opencrypto/blake2_test.c
54

Running kernel, yes. I don't follow why that means we should use the system headers, though. In general we build src things against the src tree rather than against the system headers / libraries.

Usually in development I will nextboot a test kernel to run my tests out of the src tree. I don't want to install the in-development headers into my main system all the time (or risk breaking it too much). Which is to say: at least for me, the running kernel is more likely to match the src tree headers than the system headers during testing.

And on a normally updated system or Jenkins build, the src tree and system headers will match. So I don't think it's a problem there.

58

I don't remember if there was a reason I didn't do that or not. I'll try that and report back.

sys/crypto/blake2/blake2_cryptodev.c
291

What I'm getting after is, can malicious threads cooperate to corrupt session state? I ask because I noticed that sessions don't have reference counts. :)

302

Ok

tests/sys/opencrypto/blake2_test.c
54

Hmm. I guess you're doing a standalone build of the tests then? If they were being built as part of a buildworld, I'd expect us to be pulling in cryptodev.h from the src tree even when including <crypto/cryptodev.h>.

Your workflow makes sense, but it looks like you're fighting the build system: you have to be careful to avoid mixing and matching system and src headers in a way that introduces some subtle incompatibility. The build system for tests kind of assumes that you're building and installing the tests together with world and kernel. It'd be nice if there was a target to build individual test suites directly against the src tree. Maybe "make check" is supposed to do that, but it was kind of half-baked as a concept and appears to be completely broken at the moment. :(

Anyway, I don't have any suggestions or objections here. It's just unfortunate that you're having to hack around to support what seems like a reasonable workflow. Most other tests in the tree aren't built this way, and I'm worried that the inconsistency is going to bite someone eventually. I could be completely misunderstanding something here though.

sys/crypto/blake2/blake2_cryptodev.c
291

Userspace threads? I don't think so. I think userspace session handles are file descriptors / files and the session won't be released until all fd references are gone. I might be mistaken, though.

tests/sys/opencrypto/blake2_test.c
54

I guess you're doing a standalone build of the tests then?

Yep.

The build system for tests kind of assumes that you're building and installing the tests together with world and kernel. It'd be nice if there was a target to build individual test suites directly against the src tree.

Yep. :-)

I think the build system should be enhanced eventually, but I don't have any concrete ideas for how to do that (or how the existing inconsistencies would cause issues).

cem marked 12 inline comments as done.

Incorporate Mark's feedback:

Remove racy/unneeded dying check.

Clean up free session comment.

Remove tautalogical authini conditional.

Remove initializations from definitions.

DEVMTHOD_END.

Angle brackets for blake2.h in the test program.

Anyone still looking but haven't submitted comments yet? I'm looking to shrink my git stack.

In D14662#310549, @cem wrote:

Anyone still looking but haven't submitted comments yet? I'm looking to shrink my git stack.

I don't have anything else. The change looks ok to me, but I'm not familiar enough with cryptodev to be a good reviewer.

This revision is now accepted and ready to land.Mar 21 2018, 3:29 PM
jhb added inline comments.
sys/crypto/blake2/blake2_cryptodev.c
215

No this check is just verifying that the list of algorithms in the session are something that this backend supports. That is only operated on the caller-supplied cri list, not the softc protected by the lock.

291

Yes, the session should hang around until the fd is released. That said, I feel that if sessions need ref counts, etc. that isn't something to be fixed in the driver, but something OCF should manage. It knows when it queues each request against a session, so there shouldn't be a need to for individual drivers to manage that.

sys/opencrypto/cryptosoft.c
1085

Arguably this change should be a separate commit so it can be MFC'd.

sys/crypto/blake2/blake2_cryptodev.c
215

John, Phabricator doesn't really make this clear but Mark's comment referred to an earlier version of the code which had an additional sc->dying check above the cri checks and outside of the sc->lock -- redundant with the check below.

291

There are a lot of ways OCF could be improved :-).

sys/opencrypto/cryptosoft.c
1085

Ok, I will do it separately.