random: Change the entropy harvest event queuing scheme
ClosedPublic
Actions

Authored by markj on Jul 1 2025, 3:35 PM.

Details

Reviewers

cem

Group Reviewers

csprng

Commits

rG9280aa57a5d2: random: Change the entropy harvest event queuing scheme
rG9940c974029b: random: Change the entropy harvest event queuing scheme

Summary

The entropy queue stores entropy gathered from environmental sources.
Periodically (every 100ms currently), the random kthread will drain this
queue and mix it into the CSPRNG's entropy pool(s).

The old scheme uses a ring buffer with a mutex to serialize producers,
while the sole consumer, the random kthread, avoids using a mutex on the
basis that no serialization is needed since nothing else is updating the
consumer index. On platforms without total store ordering, however,
this isn't sufficient: when a producer inserts a queue entry and updates
ring.in, there is no guarantee that the consumer will see the updated
queue entry upon observing the updated producer index. That is, the
update to ring.in may be visible before the updated queue entry is
visible. As a result, we could end up mixing in zero'ed queue entries,
though this race is fairly unlikely in practice given how infrequently
the kthread runs.

The easiest way to fix this is to make the kthread acquire the mutex as
well, and hold it while processing queue entries. However, this might
result in a long hold time if there are many queue entries, and we
really want the hold times to be short, e.g., to avoid delaying
interrupt processing.

Instead, define two buffers, always with one designated as the "active"
buffer. Producers queue entries in the active buffer, and the kthread
uses the mutex to atomically flip the two buffers, so it can process
entries from the inactive buffer without holding the mutex. This
requires more memory, but keeps mutex hold times short and lets us
keep the queue implementation very simple.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

markj created this revision.Jul 1 2025, 3:35 PM

Herald added a subscriber: imp. · View Herald TranscriptJul 1 2025, 3:35 PM

markj requested review of this revision.Jul 1 2025, 3:35 PM

Herald added a subscriber: obrien. · View Herald TranscriptJul 1 2025, 3:35 PM

Harbormaster completed remote builds in B65156: Diff 157795.Jul 1 2025, 3:35 PM

markj added a parent revision: D51111: random: Move entropy harvest queue lock macros to random_harvestq.c.Jul 1 2025, 3:35 PM

markj added a child revision: D51113: random: Define a macro for getting the CPU cycle count.

No real objection to this overall approach, but we could also just fix the MPSC queue, right? (release stores for ring.in on the producer side and acquire loads on the consumer side.) (More generally, maybe it would be good to have some sort of generic, correct MPSC implementation that this subsystem can use, instead of the bespoke thing.)

In D51112#1166631, @cem wrote:

No real objection to this overall approach, but we could also just fix the MPSC queue, right? (release stores for ring.in on the producer side and acquire loads on the consumer side.) (More generally, maybe it would be good to have some sort of generic, correct MPSC implementation that this subsystem can use, instead of the bespoke thing.)

Yeah, I started that way, but it felt a bit silly to try and optimize for a consumer that runs at 10Hz. Adding barriers would also add some overhead for insertions into the queue, and we generally want the producer side to be as cheap as possible (global mutex notwithstanding).

We could use buf_ring.h here, that provides a MPSC queue. But switching to buf_ring.h is a larger change of course, and I have no evidence that it's really warranted. In practice, we don't seem to get a large number of events at least on the amd64 systems I've tested so far. Interrupt sources need to opt in to entropy collection, and it seems that relatively few do. There are some other sources that might have a high rate but aren't universal, e.g., RANDOM_NET_NG does nothing on systems which don't use netgraph, which is probably most of them.

So, I went with a simpler approach.

In D51112#1166640, @markj wrote:

...

Sounds reasonable to me.

cem accepted this revision.Jul 4 2025, 2:04 PM

cem added inline comments.

sys/dev/random/random_harvestq.c
426	Do we need to zero-init `pos` as well?

This revision is now accepted and ready to land.Jul 4 2025, 2:04 PM

markj marked an inline comment as done.Jul 7 2025, 2:57 PM

markj added inline comments.

sys/dev/random/random_harvestq.c
426	It's not strictly necessary. I specified the index here just to be explicit about which buffer is active initially, but the structure is zero-initialized.