Details

Reviewers

jeff
cem
emaste
mjg
jmg
sbruno
gallatin
delphij

Group Reviewers

secteam
O3: Kernel Random Numbers Generator	(Owns No Changed Paths)

Commits

rS334450: Reduce overhead of entropy collection

Summary

move mask to hot primarily read only cache line
check mask first to avoid overhead of a function call
don't collect entropy on ethernet by default

This is rx of ~2.9Mpps on a 2x8x2 SKL
https://people.freebsd.org/~mmacy/2018.05.11/udprx1.svg

This is rx of ~4Mpps
https://people.freebsd.org/~mmacy/2018.05.11/udprx_norandom.svg

We're suffering major performance degradation doing this.

However, there's more to unpack here:

random_harvest_queue(m, sizeof(*m), 2, RANDOM_NET_ETHER);

We're sampling harvest size (8 bytes) from the start of the mbuf. So we're literally only ever getting the m_next pointer (the first field of the mbuf). Assuming this were actually set this would be a poor source of entropy. There are only a few thousands of mbufs allocated at low activity. However, unless you're running with jumbo frames, m_next will _always_ be null on ether_input. And, anyone running in production using jumbo frames will have turned this off. So, in the entire time since this change went in we've never gotten any entropy from this source.

Correction: it will be non-NULL on LRO packets. Nonetheless, the point holds.

More generally we need to see a lot more rigor in the justifications surrounding random's design, entropy collection needs, and entropy collection points.

How much entropy does random actually need?

How much is it actually getting from a given source?

Why aren't we using hardware entropy sources? When can we?

There are statistical tests to qualify these things. I think it's quite reasonable that we require a more substantive justification for the overhead than any that have been given to date. We're allowing very poor optics for FreeBSD on the basis of what appears to at best be conjecture.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 16941

Event Timeline

mmacy created this revision.May 23 2018, 12:45 AM

Herald added subscribers: ae, imp. · View Herald TranscriptMay 23 2018, 12:45 AM

Harbormaster completed remote builds in B16772: Diff 42865.May 23 2018, 12:45 AM

mmacy edited the summary of this revision. (Show Details)May 23 2018, 12:47 AM

cem added inline comments.May 23 2018, 12:54 AM

sys/dev/random/random_harvestq.c
490–494	KASSERT references removed parameter
sys/sys/random.h
119	I don't think the predict_false's make sense here.
158	comment mismatch

We only need to generate about 8 bytes of entropy per second total for the spiking of the algorithms...

mmacy added a reviewer: secteam.May 23 2018, 5:07 PM

mmacy edited the summary of this revision. (Show Details)May 23 2018, 5:14 PM

Pending trivial cleanups pointed out by @cem , approved.

This revision is now accepted and ready to land.May 23 2018, 11:15 PM

seanc added a subscriber: seanc.May 23 2018, 11:29 PM

Update with cem's comments
Will commit 5/31 barring further comments.

This revision now requires review to proceed.May 30 2018, 9:13 PM

Harbormaster completed remote builds in B16934: Diff 43179.May 30 2018, 9:13 PM

mjg added inline comments.May 30 2018, 9:16 PM

sys/sys/random.h
119	why do you think it does not make sense? it helps the compiler avoid a forward jump in the common case of the feature not being enabled - instead it will be more likely to insert a jmp to a section which calls random_harvest_queue_ and then jumps back. a case which arguably is already heavily pessimized. so i would argue it makes excellent sense.

gallatin added a reviewer: gallatin.May 30 2018, 9:18 PM

sbruno accepted this revision.May 30 2018, 9:19 PM

This revision is now accepted and ready to land.May 30 2018, 9:19 PM

If people want to use ethernet entropy harvesting, can we do something to make it more useful in a separate change? I was going to initially suggest the ether dst addr, but that's not very random either.

In any cases, I think these changes are a huge win. Especially moving the collection mask outside a cacheline that is written in the hot path.

LGTM modulo nit about only disabling on amd64.

sys/sys/random.h
102–105	This special casing is a little dubious — e.g., PPC64 has TRNG since POWER7 and is more than capable of driving the same high-throughput NICs as amd64. I'm leaning towards just disable it for all arch. Maybe add a runtime test to detect other sources of entropy and enable it if there's nothing else. Then again, the data is not very random (or not at all random for non-jumbo frames? Not sure we ever determined that), so maybe it's useless there too.
119	Why do you think that features being disabled is the common case? E.g., I would expect pure sources to be both enabled and sampled reasonably often. The __predict_false will mispredict those. Additionally, CPU branch predictors have been smart enough to predict the branch correctly for disabled sources since the 90s. I think predict_false/true should only be used when we're actually going to be smarter than the runtime branch predictor.

disable ethernet entropy collection by default

This revision now requires review to proceed.May 30 2018, 10:55 PM

Harbormaster completed remote builds in B16938: Diff 43185.May 30 2018, 10:55 PM

Given that there is trivially little if any entropy coming from mbufs is there a reason we're leaving this callsite at all? has anyone from secteam commented?

In D15526#330095, @jeff wrote:

Given that there is trivially little if any entropy coming from mbufs is there a reason we're leaving this callsite at all? has anyone from secteam commented?

That might be the right next step, or maybe it can be improved to be useful. I don't think we need to go that far in the first change.

sys/sys/random.h
159	comment mismatch cropped up again

The change LGTM in principal but I'd like to request a few minor (cosmetic) change.

Please add RANDOM_ENABLE_ETHER to sys/conf/NOTES; this makes sure we have build code either way in tinderbox builds.

Please fix the RANDOM_ENABLE_ETHER / RANDOM_ENABLE_UMA mismatch in sys/sys/random.h, as pointed out by @cem .

Optionally, if I was you I'd probably define random_harvest_queue_ether differently by omitting the RANDOM_NET_ETHER parameter to the macro (since the the macro's name already give a clear hint that it's for ethernet entropy, and the sole purpose of its existence is to provide a way to fully bypass the collection code).

This revision now requires changes to proceed.May 31 2018, 2:08 AM

Accept as secteam@ -- talked with so@ and we have no objection to the change in principal.

Apply changes requested by @delphij and @cem

Harbormaster completed remote builds in B16940: Diff 43191.May 31 2018, 3:23 AM

fix typo in NOTES

Harbormaster completed remote builds in B16941: Diff 43192.May 31 2018, 3:25 AM

Further clarify the intent of RANDOM_ENABLE_ETHER

Harbormaster completed remote builds in B16942: Diff 43193.May 31 2018, 3:31 AM

cem accepted this revision.May 31 2018, 3:42 AM

delphij accepted this revision.May 31 2018, 4:25 AM

This revision is now accepted and ready to land.May 31 2018, 4:25 AM

Closed by commit rS334450: Reduce overhead of entropy collection (authored by mmacy). · Explain WhyMay 31 2018, 9:53 PM

This revision was automatically updated to reflect the committed changes.

Owners added a reviewer: O3: Kernel Random Numbers Generator.May 31 2018, 9:53 PM

In D15526#330054, @gallatin wrote:

If people want to use ethernet entropy harvesting, can we do something to make it more useful in a separate change? I was going to initially suggest the ether dst addr, but that's not very random either.

In any cases, I think these changes are a huge win. Especially moving the collection mask outside a cacheline that is written in the hot path.

as I have suggested, doing a sampling is a good choice for those that want/need it.. i.e. use 10 packets/second (or less). Use the previous second's pps to calculate the n for selecting every nth packet to use such that it comes out to something like 10pps. Then you use the entire packet, not just the header w/o packet data. Gives us good entropy, makes minimal impact on system overhead, and a huge improvement.

In D15526#330700, @jmg wrote:

as I have suggested, doing a sampling is a good choice for those that want/need it.. i.e. use 10 packets/second (or less). Use the previous second's pps to calculate the n for selecting every nth packet to use such that it comes out to something like 10pps.

Sure, this seems plausible. The existing pps primitive we have probably doesn't work well on many-core systems — I think it would ping pong around with many RX queues — but one can easily imagine enhancing it to be suitable for this goal.

Then you use the entire packet, not just the header w/o packet data.

random_harvest_queue() truncates to the first 8 bytes of input due to constraint on the size of a hc_entropy_ring entry (harvest_event::he_entropy[HARVESTSIZE]). We would have to change that behavior (with potentially wide-reaching implications), or add a new primitive that does not truncate, or pre-mix packet bytes together (simple xor probably fine).

And after all that work, I'm still not sure packet contents are an especially good source of random data. It's harmless to mix in crappy random data after Fortuna is seeded, but spamming the entropy pool with crappy entropy pre-seed may unblock random before it is truly unpredictable, since our unblock heuristic is basically just a byte count of seed data.

While certainly better than what we were doing, global pps accounting would not be efficient multi-core. I'd much rather have it collect the first 10 packets or so every second.

In D15526#330707, @mmacy wrote:

While certainly better than what we were doing, global pps accounting would not be efficient multi-core. I'd much rather have it collect the first 10 packets or so every second.

That is an implementation detail that I did not specify. If you have a 8 queue nic, maybe pick 1 or 2 packets per queue, or something.. or look at traffic per nic and balance it between each, etc. There are MANY solutions that will get you close to 10pps (or maybe even less) that are perfectly workable for efficient multi-core machines.

Collecting the first 10 packets or so every second is not a great idea. With things like tcp off-load which bursts packets, it's common for bursts of packets to be correlated.

Collecting the first 10 packets or so every second is not a great idea. With things like tcp off-load which bursts packets, it's common for bursts of packets to be correlated.

Yes. True. Nonetheless, I'm much more interested in the questions I raised in the review. All of these implementation details are secondary to those.

In D15526#330717, @mmacy wrote:

Collecting the first 10 packets or so every second is not a great idea. With things like tcp off-load which bursts packets, it's common for bursts of packets to be correlated.

Yes. True. Nonetheless, I'm much more interested in the questions I raised in the review. All of these implementation details are secondary to those.

I've written answers to most of those, or at least attempted to discuss it before. But short answers are:

How much entropy does random actually need?

"None". Once a 256-bit seed has been generated at install time and no further entropy is needed. I'm fine w/ small amounts of additional entropy to prevent evil maid compromises (e.g. machine is powered off, maid clones hard drive, next boot, rng is "predictable"). Reference: https://blog.cr.yp.to/20140205-entropy.html

How much is it actually getting from a given source?

Due to above, it doesn't/shouldn't mater. Depending upon kernel sources other than the keyboard for generating initial seed is likely to be very broken.

Why aren't we using hardware entropy sources? When can we?

We are in a few cases. But there is also the case where people confuses PRNGs with TRNGs. Only the later are useful. RDRAND is now being used, I am working on setting up a lab (for everyone to access) that has various hardware so that drivers can be more easily be written and tested (no ETA though).

Oh, note that the above are my opinions, and they differ significantly from most of the rest of the FreeBSD team. Mainly why I haven't spent much time working on the rng system,

In D15526#330724, @jmg wrote:

Oh, note that the above are my opinions, and they differ significantly from most of the rest of the FreeBSD team. Mainly why I haven't spent much time working on the rng system,

That mostly looks reasonable to me — the big caveat is around initial seeding. Especially on embedded systems with readonly media, but also "evil maid" scenarios, as you point out. I'm not part of secteam, though.

reduce overhead of entropy collection
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 43192

sys/conf/NOTES

sys/conf/options

sys/dev/random/random_harvestq.c

sys/net/if_ethersubr.c

sys/sys/random.h

reduce overhead of entropy collectionClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 43192

sys/conf/NOTES

sys/conf/options

sys/dev/random/random_harvestq.c

sys/net/if_ethersubr.c

sys/sys/random.h

reduce overhead of entropy collection
ClosedPublic
Actions

Revision Contents
Changeset List