random: Don't complain noisily when an entropy source is slow
ClosedPublic
Actions

Authored by cem on May 8 2019, 3:31 AM.

Details

Reviewers

markm
delphij
mjg
imp
jmg

Group Reviewers

O3: Kernel Random Numbers Generator

Commits

rS347329: random(4): Don't complain noisily when an entropy source is slow

Summary

Mjg@ reports that RDSEED (r347239) causes a lot of logspam from this printf,
and I don't feel that it is especially useful (even ratelimited). There are
many other quality/quantity checks we're not performing on entropy sources;
lack of high frequency availability does not disqualify a good entropy
source.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

cem created this revision.May 8 2019, 3:31 AM

Harbormaster completed remote builds in B24150: Diff 57166.May 8 2019, 3:31 AM

I'm OK with this.

This revision is now accepted and ready to land.May 8 2019, 7:26 AM

I think the change as-is is fine.

(Note that if I was you I'd probably if (bootverbose) or #ifdef DIAGNOSTIC instead of removing the whole printf altogether).

sys/dev/random/random_harvestq.c
250 ↗	(On Diff #57166)	[comment only, not requesting for change] I wonder if we should actually break here; the driver typically performs a few retries already, and it looks like we are just beating a tired horse here, therefore it _might_ be reasonable to just skip the source until the next random_sources_feed() call from the random kthread, where it would yield execution.

• ian added a subscriber: • ian.May 8 2019, 2:41 PM

• ian added inline comments.

sys/dev/random/random_harvestq.c
250 ↗	(On Diff #57166)	Indeed, continuing to pester a slow source has no value. I ran into this same thing while trying to improve the rpi's hardware rng driver, and found that this code just repeatedly slammed the driver which was never going to be able to provide more data because the hardware source is just inherently slow (it also is more of a seed generator than a full on-demand source of entropy).

Thanks Mark and Xin!

In D20195#435170, @delphij wrote:

(Note that if I was you I'd probably if (bootverbose) or #ifdef DIAGNOSTIC instead of removing the whole printf altogether).

Hm, I will plan to just remove it for now. I don't find DIAGNOSTIC useful myself, and I think this print is too noisy as-is for bootverbose. It might make sense under bootverbose with a rate limit (but even that seems too spammy, imo, by bootverbose standards).

I propose we should log it at most once per registered entropy source, and if we think it is worth diagnosing, keep basic statistics on the entropy source such as number of requests, number of short results, number of empty results, and total byte count returned (the latter allows us to calculate average bytes returned per request). Those stats could be exposed via sysctl or something. They would be enabled with a sysctl/tunable knob, probably off by default, because they maybe expose some details of entropy sources that could plausibly be useful for an attacker.

<tangent>
Along the same vein, we could do some basic statistic analysis of entropy sources as we go, and look for weaknesses. Even a basic analysis would catch something like the older AMD RDRAND-after-suspend bug (every value is 0xffffffffffffffff). That said, anything complicated like the NIST tests is too expensive and brittle to put in the kernel. So it might make sense to expose a "bypass" interface to the superuser, where we poll the available sources and pass it to userspace, taking care not to reuse anything we send to userspace in Fortuna.
</tangent>

For now, I'll get the easy bandaid in place.

We should eventually have a chat with John Baldwin about what the right kernel interfaces for randomness are at some point, too. He doesn't like the idea that arc4random might block if we eventually require strong initial seeding — it can violate locking invariants and create deadlock, for example. Different early boot consumers might want different behavior. I think he would prefer if that was available as part of the API (perhaps similarly to getrandom(2) GRND_NONBLOCK). There are also consumers like TCP that really don't care if they get strong random numbers during early boot (they're filtering it down to a 32-bit value anyway), but could benefit from a sequence that cannot be predicted by an attacker (i.e., repeated SHA(), or Chacha, seeded with some weakly random value, rather than rand(3)). So it might be good to expose some new APIs that aren't arc4random (which also gives us a chance to think of better names). I don't have any concrete proposals in that direction, but it's something to think about.

sys/dev/random/random_harvestq.c
250 ↗	(On Diff #57166)	I agree that makes sense if our model of entropy sources that return zero is similar to `RDSEED` — TRNG entropy source FIFO is temporarily exhausted. (I don't know if makes sense for all sources, nor have I attempted to quantify the cost of spinning.) One caveat, `break` here wouldn't remove the spin in the case of early random seeding as implemented in D19928 . In that case, though, I am not especially worried about spinning. Anyway, food for thought. I appreciate it!