random: Add NIST SP 800-90B entropy source health test implementations
Closed, Public

Authored by markj on Jul 3 2025, 5:39 PM.

Details

Reviewers

cem

Group Reviewers

csprng

Commits

rGf92ff79720fb: random: Add NIST SP 800-90B entropy source health test implementations

Summary

This patch implements the noise source health tests described in chapter
four of NIST SP 800-90B[1]. The repetition count test and adaptive
proportion test both help identify cases where a noise source is stuck
and generating the same output too frequently. The tests are disabled
by default, but making an implementation available may help implementors
conform to FIPS validation requirements. This implementation aims to
comply with the requirements listed in section 4.3 of the document.

To enable health testing, set the kern.random.nist_healthtest_enabled
tunable to 1. Startup testing is implemented as specified in the
document: the first 1024 samples from a source are evaluated according
to the two tests, and they are discarded. The RANDOM_CACHED and
RANDOM_PURE_VMGENID sources are excluded from testing, as they are
effectively a one-time source of entropy, and statistical testing
doesn't seem to provide much use.

Since the first 1024 samples from entropy sources are discarded by the
implementation, it is possible that we might end up with insufficient
entropy during early boot if no boot-time entropy source (i.e.,
/entropy) is provided. If this is a problem, it could be remediated by
modifying the implementation to poll applicable sources (e.g., RDRAND)
to complete startup testing quickly, rather than relying on the random
kthread.

The entry point for the tests is random_harvest_healthtest(), intended
to be called from individual CSPRNG implementations in order to leverage
their locking context, e.g., the entropy pool lock in Fortuna. The
Fortuna implementation is modified to call this entry point, mainly to
demonstrate how the health tests can be integrated.

The tests operate on the entropy buffer plus the embedded timestamp,
treating them as a single value. We could alternately apply the tests
to the buffer and timestamp separately.
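As a rough illustration of what "a single value" means here, the following sketch compares two samples and reports them equal only when both the timestamp and the payload match. The `struct sample` layout, the `SAMPLE_BYTES` size, and the `sample_equal()` helper are hypothetical stand-ins for illustration, not the structures used by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SAMPLE_BYTES 8	/* hypothetical payload size */

/* Hypothetical, simplified stand-in for a harvested event. */
struct sample {
	uint32_t timestamp;		/* e.g., low bits of a cycle counter */
	uint8_t  buf[SAMPLE_BYTES];	/* entropy payload */
};

/*
 * Treat timestamp and payload as one value: two samples repeat only if
 * both fields match.  Fields are compared individually so that struct
 * padding never influences the result.
 */
static bool
sample_equal(const struct sample *a, const struct sample *b)
{
	assert(a != NULL && b != NULL);
	return (a->timestamp == b->timestamp &&
	    memcmp(a->buf, b->buf, sizeof(a->buf)) == 0);
}
```

One consequence of this choice, under the assumptions of the sketch: a source whose payload is stuck but whose timestamp keeps changing never compares equal, which is the case that testing the buffer and timestamp separately would cover.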

The main parameters for the tests themselves are H, the expected
min-entropy of samples, and alpha, the desired false positive error
rate. This implementation selects H=1 and alpha=2^{-34}; since each
sample includes a CPU cycle counter value, it seems reasonable to expect
at least one bit of entropy from among the low bits of the
high-frequency counter present on systems where FreeBSD is commonly
deployed, and the false positive rate was somewhat arbitrarily selected;
for more details see the comment in random_healthtest_init().
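For reference, SP 800-90B derives the test cutoffs from H and alpha: the repetition count cutoff is C = 1 + ceil(-log2(alpha) / H) (section 4.4.1), and the adaptive proportion cutoff is C = 1 + CRITBINOM(W, 2^{-H}, 1 - alpha) (section 4.4.2). The sketch below evaluates both under simplifying assumptions: alpha is given as an exponent (alpha = 2^{-alpha_exp}), H is an integer for the RCT helper, and the APT helper is specialized to H = 1. The function names are hypothetical, and this is not necessarily how random_healthtest_init() arrives at its values.

```c
#include <assert.h>

/* RCT cutoff: 1 + ceil(alpha_exp / h_bits), in integer arithmetic. */
static unsigned
rct_cutoff(unsigned alpha_exp, unsigned h_bits)
{
	assert(h_bits > 0);
	return (1 + (alpha_exp + h_bits - 1) / h_bits);
}

/*
 * APT cutoff for H = 1, i.e., X ~ Binomial(window, 1/2): the smallest c
 * such that P(X >= c) <= 2^-alpha_exp.
 */
static unsigned
apt_cutoff_h1(unsigned window, unsigned alpha_exp)
{
	double alpha = 1.0, pmf = 1.0, tail = 0.0;
	unsigned c, i;

	assert(window >= 1 && window <= 1024 && alpha_exp >= 1);
	for (i = 0; i < alpha_exp; i++)
		alpha *= 0.5;			/* alpha = 2^-alpha_exp */
	for (i = 0; i < window; i++)
		pmf *= 0.5;			/* P(X = window) */

	/* Walk c downward, accumulating tail = P(X >= c). */
	for (c = window; c > 0; c--) {
		tail += pmf;
		if (tail > alpha)
			return (c + 1);		/* last c with tail <= alpha */
		/* P(X = c - 1) = P(X = c) * c / (window - c + 1) */
		pmf = pmf * c / (window - c + 1);
	}
	return (1);
}
```

With the parameters named above, rct_cutoff(34, 1) evaluates to 35, so 35 identical consecutive samples would trip the repetition count test. The standard specifies W = 512 for non-binary sources and W = 1024 for binary ones; which window the patch uses is not stated in this summary.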

When a health test fails, a message is printed to the console and the source
is disabled. On-demand testing is also supported via the
kern.random.nist_healthtest_ondemand sysctl. This can be used by an
administrator to re-enable a disabled source, following the same startup
testing mentioned above.
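The per-source lifecycle described in this summary (startup testing with discarded samples, continuous testing afterwards, disabling on failure, and on-demand restart) can be pictured as a small state machine. The sketch below is a simplified model under that reading; the `ht_*` names and the structure layout are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STARTUP_SAMPLES 1024	/* startup window from the description */

enum ht_state { HT_STARTUP, HT_STEADY, HT_DISABLED };

struct ht_source {
	enum ht_state state;
	unsigned startup_seen;	/* samples consumed by startup testing */
};

/* Begin (or, for on-demand testing, re-begin) startup testing. */
static void
ht_restart(struct ht_source *s)
{
	assert(s != NULL);
	s->state = HT_STARTUP;
	s->startup_seen = 0;
}

/*
 * Record the verdict of the health tests for one sample.  Returns true
 * if the sample may be fed to the entropy pool, false if it is dropped.
 */
static bool
ht_accept(struct ht_source *s, bool tests_passed)
{
	assert(s != NULL);
	if (s->state == HT_DISABLED)
		return (false);
	if (!tests_passed) {
		s->state = HT_DISABLED;	/* a failure disables the source */
		return (false);
	}
	if (s->state == HT_STARTUP) {
		if (++s->startup_seen >= STARTUP_SAMPLES)
			s->state = HT_STEADY;
		return (false);		/* startup samples are discarded */
	}
	return (true);			/* steady state: keep the sample */
}
```

In this model, an administrator-triggered on-demand test corresponds to calling ht_restart() on a disabled source, after which the next 1024 passing samples are again consumed by startup testing.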

[1] https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-90B.pdf

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision. Jul 3 2025, 5:39 PM

Herald added a subscriber: imp. Jul 3 2025, 5:39 PM

markj requested review of this revision. Jul 3 2025, 5:39 PM

Harbormaster completed remote builds in B65196: Diff 157899. Jul 3 2025, 5:39 PM

Herald added a subscriber: obrien. Jul 3 2025, 5:39 PM

markj added a child revision: D51155: random: Treat writes to /dev/random as separate from /entropy. Jul 3 2025, 5:39 PM

Commit message nit:

When a health test fails, a message is printed to the console AND the source
is disabled.

I think?

sys/dev/random/random_harvestq.c
332	I think HARVESTSIZE+1 isn't doing much here without the `static` keyword. I might be mistaken.
391–399	Haven't read the NIST doc but so far I'm not sure I understand the difference between the apt/rct tests.
435	Do we continue to evaluate this on all samples in the steady state (after initial testing has passed) if so configured? I read the commit message as indicating we stopped testing after initial samples, but I might have misread.
515–516	We should probably do some range validation on source.
sys/dev/random/random_harvestq.h
52	Can this take a `const` harvest_event pointer?

In D51154#1167575, @cem wrote:

Thank you for taking a look!

> Commit message nit:
>
> When a health test fails, a message is printed to the console AND the source
> is disabled.
>
> I think?

You're right.

sys/dev/random/random_harvestq.c
391–399	The RCT test checks for N consecutive samples with the same value. The APT test checks whether, in a window of W consecutive samples, the first value in the window appears "too many" times. These tests are quite weak and don't seem to do much more than detect catastrophic failure of the entropy source. A source which just outputs two alternating values will not be detected by these tests, for instance.
435	Yeah, the testing is meant to be continuous. The difference is in startup mode, i.e., during the first 1024 samples, we discard the samples after the test. After that, the tests still run on all samples, but the samples aren't discarded unless a test fails. I'll tweak the description to make that clearer.
sys/dev/random/random_harvestq.h
52	I believe so.
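To make the RCT/APT distinction from the reply above concrete, here is a minimal sketch of both tests following the descriptions in SP 800-90B sections 4.4.1 and 4.4.2. It operates on plain 64-bit values rather than harvest events, and the structure and function names are hypothetical; it is not the code under review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Repetition count test: fail on `cutoff` identical consecutive samples. */
struct rct {
	uint64_t last;		/* most recent sample value */
	unsigned run;		/* length of the current run of `last` */
	unsigned cutoff;
	bool	 primed;	/* false until the first sample is seen */
};

/* Adaptive proportion test: count repeats of a window's first sample. */
struct apt {
	uint64_t first;		/* first sample of the current window */
	unsigned count;		/* occurrences of `first` in the window */
	unsigned seen;		/* samples consumed in the current window */
	unsigned window;
	unsigned cutoff;
};

/* Returns false when the run length reaches the cutoff. */
static bool
rct_update(struct rct *t, uint64_t x)
{
	assert(t->cutoff > 1);
	if (t->primed && x == t->last) {
		t->run++;
	} else {
		t->last = x;
		t->run = 1;
		t->primed = true;
	}
	return (t->run < t->cutoff);
}

/* Returns false when the first value's count reaches the cutoff. */
static bool
apt_update(struct apt *t, uint64_t x)
{
	bool ok;

	assert(t->cutoff > 1 && t->window > 0);
	if (t->seen == 0) {		/* start a new window */
		t->first = x;
		t->count = 1;
	} else if (x == t->first) {
		t->count++;
	}
	ok = t->count < t->cutoff;
	if (++t->seen == t->window)
		t->seen = 0;
	return (ok);
}
```

Feeding this sketch a source that alternates between two values keeps the RCT run length at 1 and the APT count at half the window, so neither test fires for any APT cutoff above W/2, which matches the weakness noted above. A source that is stuck on one value fails the RCT on the sample that completes a run of `cutoff` repeats.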