Change Details

* Registers TRNG source for random(4) * Finds available queues, LSBs; allocates static objects * Allocates a shared MSI-X for all queues. The hardware does not have separate interrupts per queue. Working interrupt mode driver. * Computes SHA hashes, HMAC. Passes cryptotest.py, cryptocheck tests. * Does AES-CBC, CTR mode, and XTS. cryptotest.py and cryptocheck pass. * Support for "authenc" (AES + HMAC). (SHA1 seems to result in "unaligned" cleartext inputs from cryptocheck -- which the engine cannot handle. SHA2 seems to work fine.) * GCM passes for block-multiple AAD, input lengths Largely based on ccr(4), part of cxgbe(4). Rough performance averages on AMD Ryzen 1950X (4kB buffer): aesni: SHA1: ~8300 Mb/s SHA256: ~8000 Mb/s ccp: ~630 Mb/s SHA256: ~660 Mb/s SHA512: ~700 Mb/s cryptosoft: ~1800 Mb/s SHA256: ~1800 Mb/s SHA512: ~2700 Mb/s As you can see, performance is poor in comparison to aesni(4) and even cryptosoft (due to high setup cost). At a larger buffer size (128kB), throughput is a little better (but still worse than aesni(4)): aesni: SHA1:~10400 Mb/s SHA256: ~9950 Mb/s ccp: ~2200 Mb/s SHA256: ~2600 Mb/s SHA512: ~3800 Mb/s cryptosoft: ~1750 Mb/s SHA256: ~1800 Mb/s SHA512: ~2700 Mb/s AES performance has a similar story: aesni: 4kB: ~11250 Mb/s 128kB: ~11250 Mb/s ccp: ~350 Mb/s 128kB: ~4600 Mb/s cryptosoft: ~1750 Mb/s 128kB: ~1700 Mb/s This driver is EXPERIMENTAL. You should verify cryptographic results on typical and corner case inputs from your application against a known- good implementation.