
random: Add CCP random source
ClosedPublic

Authored by cem on Jan 15 2018, 11:35 PM.

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

This is fine (obviously missing the actual implementation). Adding Dean to the reviewers; he has a history of doing assessments of HW TRNGs and might be a good collaborator to look at the quality of the bits coming from ccp(4).

Sorry, I removed my last comment: I wrote it thinking we were talking about PPP link compression as an entropy source, then saw the link to the AMD generator. I’m happy to help with measurements or anything else I can, as time allows. So, like Gordon, I’m interested in seeing an implementation, but I think this bit adding it to the source list is fine.

The implementation is in the linked review in the description. We read a 32-bit register (TRNG_OUT) and get a new value every time. Here's some documentation from the device manual on how those bits are created and treated:

§ 7.8 RNG
The Random Number Generator (RNG) block is used to generate high quality random numbers using a hardware entropy source (ring oscillators). Samples from these ring oscillators (ROs) are conditioned with AES-CBC and fed into a CTR_DRBG as specified by NIST SP 800-90A. The output of this block is stored in a FIFO which is read by software and engines when they need random values.
...

§ 7.8.1 Entropy
The RNG uses 16 RO chains as a noise source. These ROs are free-running and the natural clock jitter from the chains is used to create entropy. Each RO chain has a prime number of inverters and is different from each other chain to prevent synchronization. Once enabled, the RO outputs are sampled every clock cycle.

§ 7.8.2 Conditioning
In the RNG design, each raw noise source output is estimated to have a min-entropy of 0.5 bits. Therefore, each 16 bit sample is estimated to have only 8 bits of entropy. This estimate is used to allow for natural bias that can occur due to manufacturing variation. A min-entropy of 0.5 ensures that the ROs may be up to 70% biased while still providing the level of entropy required for proper RNG operation.
Entropy is conditioned using the AES CBC-MAC construction specified in NIST SP 800-90B. Using this construction, 512 bits of RO samples are accumulated and then fed through the AES engine to create 128 bits of conditioned entropy. This output is seed-quality entropy as defined in the NIST publication.
This process of sampling data and conditioning it is repeated until a full 384-bit seed is constructed. After the initial seed is created, the accumulator continues to run by XOR’ing new RO samples into its holding register and conditioning entropy until needed.
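A minimal sketch of the entropy accounting in § 7.8.2, for readers who want to check the arithmetic. The constant names are mine, not the manual's, and reading each 16-bit sample as one bit per RO chain is an assumption drawn from § 7.8.1.

```python
# Entropy budget implied by the quoted manual text (names are illustrative).
BITS_PER_SAMPLE = 16        # assumed: one bit from each of the 16 RO chains
MIN_ENTROPY_PER_BIT = 0.5   # manual's estimate for each raw output bit
ACCUMULATED_BITS = 512      # raw bits gathered before one conditioning pass
CONDITIONED_BITS = 128      # one AES block of conditioned entropy per pass
SEED_BITS = 384             # seed size handed to the DRBG

entropy_per_sample = BITS_PER_SAMPLE * MIN_ENTROPY_PER_BIT   # 8 bits
samples_per_pass = ACCUMULATED_BITS // BITS_PER_SAMPLE       # 32 samples
entropy_per_pass = ACCUMULATED_BITS * MIN_ENTROPY_PER_BIT    # 256 bits in, 128 out
passes_per_seed = SEED_BITS // CONDITIONED_BITS              # 3 passes
raw_samples_per_seed = passes_per_seed * samples_per_pass    # 96 samples

# Min-entropy is H = -log2(p_max), so H = 0.5 corresponds to a most likely
# bit value with probability 2**-0.5, which is the "70% biased" figure.
p_max = 2 ** -MIN_ENTROPY_PER_BIT

print(entropy_per_sample, raw_samples_per_seed, round(p_max, 4))
```

Under these assumptions each conditioning pass compresses an estimated 256 bits of min-entropy into 128 output bits, a 2:1 margin.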

§ 7.8.3 Health Checks
Per NIST SP 800-90B, the hardware implements the Repetition Count and Adaptive Proportion health checks. If enabled, these health checks run continuously and check that the min-entropy of the RO samples does not fall below the required threshold of 0.5 bits per sample.
The Repetition Count check looks for repeated 16-bit values. If the sample value is encountered more than the allowed cutoff, an error is fired.
The Adaptive Proportion test looks for the same 16-bit sample occurring multiple times within a fixed window (4096 samples). If the sample occurs more than the allowed cutoff, this test fires an error.
If either health check fires, an optional interrupt can be generated to the private host. In addition, the RNG will stop producing random values and return 0 to register reads. The BadEntropy bit in TRNG_CTL will be set indicating an entropy error. Software may then take corrective action, and if it desires may re-initialize the RNG by writing all 0’s to TRNG_CTL and then following the initialization procedure.
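The two health checks can be sketched in software, which is roughly what one would run against TRNG_RAW samples. This follows the general structure of the SP 800-90B tests, not the hardware's actual logic. The false-positive rate of 2^-20 and the adaptive proportion cutoff used in the example are assumptions; the manual does not state the cutoffs the hardware uses.

```python
import math
from typing import Iterable, Sequence


def rct_cutoff(h: float, alpha_exp: int = 20) -> int:
    """SP 800-90B repetition count cutoff: C = 1 + ceil(-log2(alpha) / H).

    alpha = 2**-alpha_exp is an assumed false-positive rate.
    """
    return 1 + math.ceil(alpha_exp / h)


def repetition_count_ok(samples: Iterable[int], cutoff: int) -> bool:
    """Fail if any value repeats `cutoff` or more times in a row."""
    last = None
    run = 0
    for s in samples:
        if s == last:
            run += 1
        else:
            last, run = s, 1
        if run >= cutoff:
            return False
    return True


def adaptive_proportion_ok(samples: Sequence[int], cutoff: int,
                           window: int = 4096) -> bool:
    """Fail if the first value of any complete window occurs `cutoff`
    or more times within that window."""
    for start in range(0, len(samples) - window + 1, window):
        block = samples[start:start + window]
        if block.count(block[0]) >= cutoff:
            return False
    return True


# With the manual's estimate of 8 bits of min-entropy per 16-bit sample:
print(rct_cutoff(8.0))
```

The window size of 4096 is taken from the manual text above; the cutoff passed to `adaptive_proportion_ok` would have to be derived from the chosen false-positive rate and is left to the caller here.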

§ 7.8.4 DRBG
The RNG uses a Deterministic Random Bit Generator built around the counter mode of AES-256, as specified by NIST SP 800-90A. The DRBG takes a 384-bit seed from the entropy accumulator and creates 32-bit random numbers which are then stored in the FIFO for use by other blocks. The DRBG uses the avalanche effects of AES to quickly create high quality, statistically random values from the entropy.
After initialization, the DRBG will continue to generate random numbers and occasionally re-seed itself. The DRBG will generate no more than 2048 32-bit values on a given seed, and will aggressively try to re-seed if it is idle.
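As a quick illustration of what the reseed limit in § 7.8.4 means for a capture of TRNG_OUT, the sketch below computes how much output one seed can cover. It assumes every 32-bit value comes from consecutive reads by a single consumer; other consumers of the FIFO or idle-time reseeds would only raise the seed count.

```python
import math

WORDS_PER_SEED = 2048   # manual: at most 2048 32-bit values per seed
BYTES_PER_WORD = 4

bytes_per_seed = WORDS_PER_SEED * BYTES_PER_WORD   # 8192 bytes


def min_seeds(nbytes: int) -> int:
    """Lower bound on the number of distinct seeds behind nbytes of output."""
    return math.ceil(nbytes / bytes_per_seed)


# A 1,000,000-byte capture, the size used for the samples in this review:
print(bytes_per_seed, min_seeds(1_000_000))
```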

...

§ 7.8.6 Additional Features
The RNG supports several additional features and output registers to assist software. The RNG has 3 read-only, output registers:

  1. TRNG_OUT – Contains a 32-bit random value (output of DRBG)
  2. TRNG_SEED – Contains 32 bits of conditioned entropy (output of AES-CBC)
  3. TRNG_RAW – Contains 16 bits of raw ring oscillator samples

Each of these registers changes its value after a read. The values read from TRNG_SEED/TRNG_RAW are never used in the RNG computation. The TRNG_SEED value may be used for software that requires full entropy seeds. The TRNG_RAW value may be used to perform more advanced health checks in software on the noise source.
The RNG also supports a test mode of operation which can be enabled in TRNG_CTL. In this mode, the ring oscillators are not used to generate entropy and instead a fixed counter is used. This allows for deterministic operation, and may be used by software to test the functionality of the RNG block via known-value testing.

This revision was not accepted when it landed; it landed in state Needs Review. (Jan 16 2018, 2:56 AM)
This revision was automatically updated to reflect the committed changes.

Conrad, thanks for the details. I also looked at the code in the other review and it looks good. I’d expect whitened output from the CTR-AES DRBG to measure ~6.5 bits when put through the SP 800-90B tool. That’s roughly what you get out of 1000000 samples from RDRAND on Intel.

I only have access to Intel processors myself (I may ask some co-workers if they have AMD gear I can borrow), so I can’t sample myself. Some dtrace probes and whatnot for pulling pre-Fortuna entropy are on my GitHub; I can walk through them with anyone interested. Since this makes non-whitened raw samples from the oscillators easily accessible without affecting state, it may be worthwhile to look into adding additional health checks for those who may want them in higher-assurance environments, à la the FIPS rngtest code we already have in the tree.
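To give a flavor of the kind of extra software health check meant here, below is a minimal sketch of a FIPS 140-2 style monobit test over a 20,000-bit block. It is not the in-tree code. The bounds (9725 < ones < 10275) are the ones used by commonly circulated FIPS 140-2 test implementations and should be checked against whatever reference is being followed.

```python
BLOCK_BYTES = 2500   # 20,000 bits per test block


def monobit_ok(block: bytes) -> bool:
    """Return True if the count of 1 bits in a 20,000-bit block is in range."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("monobit test expects exactly 2500 bytes")
    ones = sum(bin(b).count("1") for b in block)
    return 9725 < ones < 10275


# A stuck-at-zero source fails; a perfectly alternating bit pattern passes
# this particular test (which is why it is only one of several checks).
print(monobit_ok(bytes(BLOCK_BYTES)), monobit_ok(b"\x55" * BLOCK_BYTES))
```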

I will take a deeper look tomorrow when I am somewhere I’m not limited to typing with my thumbs.

> Conrad, thanks for the details. I also looked at the code in the other review and it looks good. I’d expect whitened output from the CTR-AES DRBG to measure ~6.5 bits when put through the SP 800-90B tool. That’s roughly what you get out of 1000000 samples from RDRAND on Intel.

FWIW, these processors also have RDRAND. I don't know if the RDRAND implementation is related to the CCP device TRNG or not.

I obtained some sample output from the CTR-AES DRBG via kgdb and /dev/mem:

$ kgdb
(kgdb) p/x g_ccp_softc.pci_bus_handle
$3 = 0xfffff800efc00000

// 0xC is the offset of the TRNG_OUT register:

(kgdb) p/x *(uint32_t*)(g_ccp_softc.pci_bus_handle + 0xc)
$4 = 0xa5ea1288
(kgdb) p/x *(uint32_t*)(g_ccp_softc.pci_bus_handle + 0xc)
$5 = 0xc018397b
...
$ python3 -c 'import os; \
f = os.open("/dev/mem", os.O_RDONLY); \
o = open("AMD_TRNG_8bit.bin", "wb"); \
o.write(b"".join([os.pread(f, 4, 0xefc0000c) for x in range(1000000//4)])); \
o.close()'

That sample is available here: https://people.freebsd.org/~cem/AMD_TRNG_8bit.bin
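For anyone who wants to repeat the capture, here is a more readable sketch of the one-liner above. The function names and the `keep` parameter are mine; the physical address in the usage comment is specific to the machine in this review and will differ elsewhere.

```python
import os


def read_register(fd: int, offset: int, count: int, keep: int = 4) -> bytes:
    """Read a 4-byte register `count` times via pread, keeping the first
    `keep` bytes of each read."""
    return b"".join(os.pread(fd, 4, offset)[:keep] for _ in range(count))


def capture(path: str, offset: int, out_path: str, total_bytes: int,
            keep: int = 4) -> int:
    """Capture `total_bytes` of register output from `path` into `out_path`.

    Returns the number of bytes written.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        data = read_register(fd, offset, total_bytes // keep, keep)
    finally:
        os.close(fd)
    with open(out_path, "wb") as out:
        out.write(data)
    return len(data)


# Example matching the one-liner (needs root and the CCP at this address):
# capture("/dev/mem", 0xefc0000c, "AMD_TRNG_8bit.bin", 1_000_000)
```

Passing `keep=2` truncates each 32-bit read to two bytes, which suits a register that holds only 16 meaningful bits.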

How long do these tests take to run? I've kicked off the iid_main on the provided 8-bit sample input ( https://github.com/usnistgov/SP800-90B_EntropyAssessment ) and it is taking "a long time." I'll leave it running overnight, along with tests on the AMD TRNG output.

> I only have access to Intel processors myself (I may ask some co-workers if they have AMD gear I can borrow), so I can’t sample myself.

FYI, this device may only be present in the latest generation AMD hardware (Zen).

> Some dtrace probes and whatnot for pulling pre-Fortuna entropy are on my GitHub; I can walk through them with anyone interested. Since this makes non-whitened raw samples from the oscillators easily accessible without affecting state, it may be worthwhile to look into adding additional health checks for those who may want them in higher-assurance environments, à la the FIPS rngtest code we already have in the tree.

Certainly if someone else wants to do higher assurance validation, that is fine. I do not plan to :-).

> I will take a deeper look tomorrow when I am somewhere I’m not limited to typing with my thumbs.

Thanks, much appreciated.

The noniid test has completed:

$ pypy noniid_main.py AMD_TRNG_8bit.bin 8
reading 1000000 bytes of data
-----------------------
min-entropy = 5.77879

Don't forget to run the sanity check on a restart dataset using H_I = 5.77879

In D13925#292062, @cem wrote:

>> Conrad, thanks for the details. I also looked at the code in the other review and it looks good. I’d expect whitened output from the CTR-AES DRBG to measure ~6.5 bits when put through the SP 800-90B tool. That’s roughly what you get out of 1000000 samples from RDRAND on Intel.

> FWIW, these processors also have RDRAND. I don't know if the RDRAND implementation is related to the CCP device TRNG or not.

> I obtained some sample output from the CTR-AES DRBG via kgdb and /dev/mem:

What's the output when you grab the TRNG_RAW? Would that be an interesting assessment as it would get the DRBG input and assess the quality of that entropy?

> What's the output when you grab the TRNG_RAW? Would that be an interesting assessment as it would get the DRBG input and assess the quality of that entropy?

Sure. I've grabbed both RAW and SEED inputs as they might both be interesting. SEED is a 32-bit register at 0x28; RAW is a 16-bit register at 0x2c. Due to limitations of /dev/mem I had to grab RAW as a 32-bit value and truncate:

python3 -c 'import os; f = os.open("/dev/mem", os.O_RDONLY); o = open("AMD_TRNG_RAW_8bit.bin", "wb"); o.write(b"".join([os.pread(f, 4, 0xefc0002c)[:2] for x in range(1000000//2)])); o.close()'

Both samples are here if someone wants to run analysis on them:

https://people.freebsd.org/~cem/AMD_TRNG_RAW_8bit.bin
https://people.freebsd.org/~cem/AMD_TRNG_SEED_8bit.bin

The iid analysis came back on the NIST-provided random sample:

IID = True
min-entropy = 7.86511

Don't forget to run the sanity check on a restart dataset using H_I = 7.86511

Still waiting on iid on the conditioned output, iid+noniid on RAW and SEED samples.

iid on conditioned output samples:

$ pypy iid_main.py AMD_TRNG_8bit.bin 8
reading 1000000 bytes of data



IID = True
min-entropy = 7.88724

Don't forget to run the sanity check on a restart dataset using H_I = 7.88724
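To make the reported number less opaque: for data judged IID, SP 800-90B bases the min-entropy figure on the most common value estimate. The sketch below implements that estimate as given in the published standard; the script used in this review may differ in detail, so treat this as an approximation of what it computes and not as its actual code.

```python
import math
from collections import Counter


def mcv_min_entropy(data: bytes) -> float:
    """Most common value estimate of min-entropy, in bits per byte.

    p_hat is the observed frequency of the most common byte; the estimate
    uses an upper 99% confidence bound on it (z = 2.576).
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two samples")
    p_hat = Counter(data).most_common(1)[0][1] / n
    p_upper = min(1.0, p_hat + 2.576 * math.sqrt(p_hat * (1 - p_hat) / (n - 1)))
    return -math.log2(p_upper)


# Usage, assuming the sample file from this review is in the current directory:
# with open("AMD_TRNG_8bit.bin", "rb") as f:
#     print(mcv_min_entropy(f.read()))
```

Because the estimate works from an upper confidence bound, even an ideal uniform byte source reports somewhat less than 8 bits on a finite sample.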

noniid on RAW and SEED samples:

$ pypy noniid_main.py ~/AMD_TRNG_RAW_8bit.bin 8
reading 1000000 bytes of data
-----------------------
min-entropy = 5.75839

Don't forget to run the sanity check on a restart dataset using H_I = 5.75839

$ pypy noniid_main.py ~/AMD_TRNG_SEED_8bit.bin 8
reading 1000000 bytes of data
-----------------------
min-entropy = 5.76964

Don't forget to run the sanity check on a restart dataset using H_I = 5.76964

So, anything above 4 bits per 8-bit sample (50%) would get you past the measurement part of an entropy analysis. Over 6, and IAD may think you did it wrong. These chips are tricky because they whiten their output. But unless you are using the chip as the sole entropy source in your system, the whitening isn’t really an issue.

The total entropy fed into Fortuna in randomdev_hash_iterate is the important bit, and it would be non-IID because of the interdependencies between environmental entropy sources. The output of /dev/random is in turn fed as input into, for instance, your FIPS DRBG in OpenSSL, which is why you take that measurement.

Because the RNG in this chip is already an SP 800-90A DRBG, it has already been assessed. You could just pull from it directly and use the output. (I suspect Linux exposes it as /dev/hwrng, as it does for RDRAND, instead of mixing it into /dev/random, which leaves you to use rng-tools to mix it in from userland.)

Those numbers look about correct though.
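The 4-bit (50%) rule of thumb above can be applied to the figures already posted in this review with a couple of lines. The dictionary keys are just labels I chose; the values are copied from the tool output earlier in the thread.

```python
SAMPLE_BITS = 8   # the tools were run with 8 bits per symbol

# min-entropy per 8-bit sample, as reported earlier in this review
reported = {
    "TRNG_OUT noniid": 5.77879,
    "TRNG_OUT iid": 7.88724,
    "TRNG_RAW noniid": 5.75839,
    "TRNG_SEED noniid": 5.76964,
}


def entropy_fraction(h: float, bits: int = SAMPLE_BITS) -> float:
    """Fraction of the sample width that is estimated min-entropy."""
    return h / bits


for label, h in reported.items():
    verdict = "above" if entropy_fraction(h) > 0.5 else "at or below"
    print(f"{label}: {entropy_fraction(h):.1%} ({verdict} the 50% mark)")
```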

iid test on RAW and SEED:

$ pypy iid_main.py ~/AMD_TRNG_SEED_8bit.bin 8
reading 1000000 bytes of data

IID = True
min-entropy = 7.86443

Don't forget to run the sanity check on a restart dataset using H_I = 7.86443

$ pypy iid_main.py ~/AMD_TRNG_RAW_8bit.bin 8
reading 1000000 bytes of data

IID = True
min-entropy = 7.80532

Don't forget to run the sanity check on a restart dataset using H_I = 7.80532

I'm not super interested in userspace RNG, but it would be easy to expose a char device (or even easier to expose a sysctl) if someone wants that. The documentation for the device claims that data pulled via register read of the RAW and SEED registers is not re-used in subsequent stages.