Page MenuHomeFreeBSD

ossl: Add support for AES-CBC cipher
ClosedPublic

Authored by kd on Sep 24 2021, 11:38 AM.
Tags
None
Referenced Files
F107196592: D32099.diff
Sat, Jan 11, 12:38 PM
Unknown Object (File)
Thu, Jan 2, 7:40 AM
Unknown Object (File)
Thu, Dec 19, 12:03 AM
Unknown Object (File)
Wed, Dec 18, 4:28 AM
Unknown Object (File)
Fri, Dec 13, 7:09 PM
Unknown Object (File)
Dec 1 2024, 3:34 PM
Unknown Object (File)
Nov 30 2024, 11:47 PM
Unknown Object (File)
Nov 30 2024, 5:42 AM
Subscribers

Details

Summary

Create a new abstraction layer for ciphers basing on CHACHA20 implementation.
Hook it up with the AES-CBC OpenSSL assembly code.
Contrary to the SHA and CHACHA20, AES OpenSSL assembly logic does not have a fallback implementation in case CPU doesn't support required instructions.
Because of that CPU caps are checked during initialization and AES support is advertised only if available.
The feature is available on all architectures that ossl supports: i386, amd64, arm64.

The biggest advantage of this patch over existing solutions (aesni(4) and armv8crypto(4)) is that it supports SHA, allowing for ETA operations.

Test Plan

functional:
cryptocheck -z -a aes-cbc -d ossl

benchmarks:

Intel i5-10500E:

cryptotest -d aesni -a aes 100000 64 128 256 512 1024 1480 65536
0.201 sec, 200000 aes crypts, 64 bytes, 63550381 byte/sec, 484.9 Mb/sec
0.209 sec, 200000 aes crypts, 128 bytes, 122628856 byte/sec, 935.6 Mb/sec
0.224 sec, 200000 aes crypts, 256 bytes, 228665345 byte/sec, 1744.6 Mb/sec
0.254 sec, 200000 aes crypts, 512 bytes, 403761607 byte/sec, 3080.5 Mb/sec
0.310 sec, 200000 aes crypts, 1024 bytes, 660545014 byte/sec, 5039.6 Mb/sec
7.964 sec, 200000 aes crypts, 65536 bytes, 1645868953 byte/sec, 12557.0 Mb/sec

cryptotest -d ossl -a aes 100000 64 128 256 512 1024 1480 65536
0.151 sec, 200000 aes crypts, 64 bytes, 84700339 byte/sec, 646.2 Mb/sec
0.158 sec, 200000 aes crypts, 128 bytes, 162515950 byte/sec, 1239.9 Mb/sec
0.173 sec, 200000 aes crypts, 256 bytes, 296011933 byte/sec, 2258.4 Mb/sec
0.202 sec, 200000 aes crypts, 512 bytes, 507566407 byte/sec, 3872.4 Mb/sec
0.258 sec, 200000 aes crypts, 1024 bytes, 795111308 byte/sec, 6066.2 Mb/sec
7.740 sec, 200000 aes crypts, 65536 bytes, 1693390748 byte/sec, 12919.5 Mb/sec

LS1028ARDB(armv8):

cryptotest -d armv8crypto -a aes 100000 64 128 256 512 1024 1480 65536
1.075 sec, 200000 aes crypts, 64 bytes, 11910877 byte/sec, 90.9 Mb/sec
1.117 sec, 200000 aes crypts, 128 bytes, 22917362 byte/sec, 174.8 Mb/sec
1.188 sec, 200000 aes crypts, 256 bytes, 43104138 byte/sec, 328.9 Mb/sec
1.324 sec, 200000 aes crypts, 512 bytes, 77341682 byte/sec, 590.1 Mb/sec
1.601 sec, 200000 aes crypts, 1024 bytes, 127910383 byte/sec, 975.9 Mb/sec
37.105 sec, 200000 aes crypts, 65536 bytes, 353250820 byte/sec, 2695.1 Mb/sec

cryptotest -d ossl -a aes 100000 64 128 256 512 1024 1480 65536
1.131 sec, 200000 aes crypts, 64 bytes, 11320071 byte/sec, 86.4 Mb/sec
1.157 sec, 200000 aes crypts, 128 bytes, 22122154 byte/sec, 168.8 Mb/sec
1.197 sec, 200000 aes crypts, 256 bytes, 42758098 byte/sec, 326.2 Mb/sec
1.301 sec, 200000 aes crypts, 512 bytes, 78708867 byte/sec, 600.5 Mb/sec
1.517 sec, 200000 aes crypts, 1024 bytes, 135034451 byte/sec, 1030.2 Mb/sec
25.914 sec, 200000 aes crypts, 65536 bytes, 505798066 byte/sec, 3858.9 Mb/sec

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kd requested review of this revision.Sep 24 2021, 11:38 AM
kd edited the test plan for this revision. (Show Details)
kd edited the test plan for this revision. (Show Details)

I would probably prefer to make ossl_cipher more abstract and instead having a 'process' hook, or perhaps separate 'encrypt' and 'decrypt' hooks that for Chacha20 would just call the existing ossl_chacha20() and provide suitable AES-CBC routines in ossl_aes.c. Chacha has additional worries about counter rollover to handle that AES-CBC does not for example and I think it's perhaps better to just keep them separate so they can be optimized for their specific use cases. For ossl_aes.c you could then avoid having function pointers for the arch-specific encrypt/decrypt hooks by using macros (aes_cbc_encrypt, etc.) . You could also use a function pointer in ossl_cipher for 'setkey' called from ossl_cipher_newsession and perhaps permit that function pointer to be NULL. I've even considered doing some similar changes in cryptosoft.c to split swcr_encdec into separate cases for CBC vs stream ciphers (and for stream ciphers supporting a multi-block interface).

sys/crypto/openssl/ossl.c
164

This check is kind of useless and wrong since you can't do a 136 bit key for AES. Other drivers instead explicitly check for the valid key sizes (128, 192, 256) in probesession. That is what this should do instead in the switch in the AES_CBC case. For example, aesni.c does:

	case CRYPTO_AES_CBC:
	case CRYPTO_AES_ICM:
		switch (csp->csp_cipher_klen * 8) {
		case 128:
		case 192:
		case 256:
			break;
		default:
			CRYPTDEB("invalid CBC/ICM key length");
			return (false);
		}
		if (csp->csp_ivlen != AES_BLOCK_LEN)
			return (false);
		break;

in aesni_cipher_supported().

291

You don't need this check here as probesession should fail the session instead before this.

395

Note that stream ciphers like AES-CTR don't use separate decrypt contexts, so this is somewhat CBC mode specific.

sys/crypto/openssl/ossl.h
37

The intention is to not expose these details in ossl.h but to keep the MD details private to ossl_<arch>.c. I would consider having ossl_cpuid() set sc->has_aes at the end perhaps allowing you to remove the need to expose these outside of the per-arch files.

62

Going forward the sizes might be different however. I would prefer to use separate ossl_hash_context and ossl_cipher_context structures and shrink ossl_cipher_context to be as small as needed for now.

  • Move AES CPU capabilities check to ossl_cpuid().
  • Use separate context for cipher.
  • Fix key length check.
  • Remove redundant check.
sys/modules/ossl/Makefile
54

You should be able to use .arch_extension aes in aesv8-armx.S to enable the aes extension.

kd marked 4 inline comments as done.Sep 27 2021, 10:06 AM

The idea behind ossl_cipher_process() is to have all the crypto_cursor logic in one place.
The 'process' routine gets a buffer whose size is a multiple of the block length, pre-prepared key and iv.
With this approach cipher specific code doesn't need to have any logic related to how the kernel crypto layer stores data.
Right now we pretty much have to duplicate the crypto_cursor bits.
I suppose that I could just move ossl_cipher_process() logic to ossl_aes.c and go from there.
We'd still need to expose set_key to ossl.c for performance reasons, but as you pointed out that shouldn't be a big problem.
Any thoughts?

sys/crypto/openssl/ossl.c
395

That's true. I suppose that I could change set_key to take bool encrypt as an argument.
I'd still prefer to have a separate encrypt and decrypt key contexts for performance reasons.
What do you think about that?

sys/modules/ossl/Makefile
54

The problem is that aesv8-armx.S is an auto-generated file taken from OpenSSL:

/* $FreeBSD$ */
/* Do not modify. This file is auto-generated from aesv8-armx.pl. */

I'd prefer to avoid modifying it since that would make merging new versions of it harder.

  • Check if set_*_key is NULL before calling it.
  • Modify ossl_cipher to be more higher level, move ossl_process_cipher to ossl_aes.c.
  • Call CHACHA20 code through the new abstraction layer.
  • Declare MD assembly routines in ossl_aarch64.h and ossl_x86.h files, use AES_CBC_ENCRYPT to expose them to ossl_aes.c
  • Add ossl to the aesmodules in cryptotest.py

Any update on this? I'd like to merge it in the middle of next week.

Do you have a followup change that adds support for CSP_MODE_ETA?

sys/crypto/openssl/ossl.c
166

Minor: I would just collapse these two switch statements together, so e.g.:

case CRYPTO_AES_CBC:
      switch (csp->csp_cipher_klen * 8) {
      case 128:
      case 192:
      case 256:
          break;
      default:
          return (NULL);
      }
      return (&ossl_cipher_aes_cbc);
This revision is now accepted and ready to land.Oct 29 2021, 10:15 PM
In D32099#738910, @jhb wrote:

Do you have a followup change that adds support for CSP_MODE_ETA?

Yes, D32100

This revision was automatically updated to reflect the committed changes.