Details

Reviewers

ae
jmg
markm
delphij
imp
lstewart
cem

Summary

This is an attempt to use arc4random() more extensively in the kernel.
Particularly interesting may be ipfw and congestion control, although
the uses there seem to be little related with security.

I excluded using arc4random in contrib'ed code, in tests, in devices and in the screensavers. devices may have to be investigated later.

Test Plan

I rebuilt my kernel and I am running it.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 28321
Build 26424: arc lint + arc unit

Event Timeline

pfg updated this revision to Diff 16544.May 18 2016, 8:21 PM

pfg retitled this revision from to Make more use of arc4random() in the kernel..

pfg updated this object.

pfg edited the test plan for this revision. (Show Details)

Herald added a subscriber: imp. · View Herald TranscriptMay 18 2016, 8:21 PM

Hello;

The improvement is probably not huge but given that we can provide better randomness ... why not? :)

This touches a bunch of code in the kernel and I want to make sure the code was not side-stepped in the past from conversion for some reason. Thanks for reviewing.

pfg updated this object.May 18 2016, 8:49 PM

pfg edited the test plan for this revision. (Show Details)

In D6442#136542, @pfg wrote:

Hello;

The improvement is probably not huge but given that we can provide better randomness ... why not? :)

Without numbers quantifying the impact of switching to arc4random(), it is difficult to agree this is a good idea or provide any useful feedback. I don't possess any intuitive notion of the overhead vs entropy improvement compared with random().

In D6442#136569, @lstewart wrote:

In D6442#136542, @pfg wrote:

Hello;

The improvement is probably not huge but given that we can provide better randomness ... why not? :)

Without numbers quantifying the impact of switching to arc4random(), it is difficult to agree this is a good idea or provide any useful feedback. I don't possess any intuitive notion of the overhead vs entropy improvement compared with random().

From my measurements in filesystems, the performance difference was not quantifiable. We just happen to use random numbers very rarely in filesystems and it basically happens during formatting.
Also take into account that:

The regular random() is rather slow.
Modern processors (which I don't have access to) should take stuff like arc4random() rather easily.

It very much depends on the application and the type of load so I am not suggesting just committing this patch. It has to be evaluated by different developers and it's unlikely we could do this for, say, 11-Release. Perhaps we could incorporate it early in the 12-current cycle though which would give it ample testing for a year or two.

For now I would appreciate reviews of the application: there was some scaling that had to be adjusted for the congestion algorithms. Are those OK?

I think we should get Chacha20, getentropy(), etc. first.

Please do not use arc4random() % but instead use arc4random_uniform (we should probably get ChaCha20 in base first) by the way.

sys/net/if_spppsubr.c
4346 ↗	(On Diff #16544)	I don't think this code is right. Why can't the whole challenge be generated by e.g. arc4random_buf (not in kernel but we should have it)?

This revision now requires changes to proceed.May 20 2016, 6:27 PM

In D6442#137246, @delphij wrote:

I think we should get Chacha20, getentropy(), etc. first.

I think that is a PR with a patch for that, but an important part
of the patch touches the VM. That needs a review from the experts.

Please do not use arc4random() % but instead use arc4random_uniform (we should probably get ChaCha20 in base first) by the way.

We don't have arc4random_uniform in the kernel.

I am considering dropping completely this patch and instead rewrite random() to use a better and faster, but not crypto-safe, algorithm based on KISS.

pfg mentioned this in D6488: libkern: srandom() seed 0: produces predictable result..May 21 2016, 10:45 PM

emaste added a subscriber: emaste.Mar 21 2017, 4:50 PM

Update: some code has already been replaced.

Herald added a subscriber: donner. · View Herald TranscriptDec 25 2019, 4:28 PM

Harbormaster completed remote builds in B28316: Diff 65979.Dec 25 2019, 4:28 PM

Add cem@.

sys/net/if_spppsubr.c
4346 ↗	(On Diff #16544)	This code doesn't exist anymore.

Drop some replacements wehre the different retrn value type is important.

Harbormaster completed remote builds in B28317: Diff 65981.Dec 25 2019, 5:28 PM

Drop one more case where the return type matters.

Harbormaster completed remote builds in B28318: Diff 65982.Dec 25 2019, 6:07 PM

I think it would be more straightforward to divide this into reviews by subject code area. Feel free to cc me in all of those if you do so. I’m happier to see more use of arc4random when it’s a good fit.

As far as earlier concerns: arc4random(9) and (3) are now chacha-based in 12 and newer; but we still lack arc4random_uniform() in CURRENT.

In D6442#502158, @cem wrote:

I think it would be more straightforward to divide this into reviews by subject code area. Feel free to cc me in all of those if you do so. I’m happier to see more use of arc4random when it’s a good fit.

I may still split it. but its not so big anymore :). There were some cases where it was actually wrong to use due to the return type difference.

As far as earlier concerns: arc4random(9) and (3) are now chacha-based in 12 and newer; but we still lack arc4random_uniform() in CURRENT.

TBH, arc4random_uniform() is not too important: it basically applies to the cases where one uses the modulus operation to limit the range, so if we do bring it the replacement could be done automatically with coccinelle.

sys/geom/multipath/g_multipath.c
901	This one is wrong: random returns a u_long, while arc4random(9 returns uint32.

cem added inline comments.Dec 25 2019, 6:52 PM

sys/kern/kern_fail.c
170	I believe this should remain signed. Otherwise 100% probability is unrepresentable. E.g., `-1` currently indicates that.
571	Should use arc4random_uniform(), which still needs to be imported to the kernel.
sys/kern/kern_synch.c
610	This seems performance insensitive to me; approved as far as I’m concerned. 👍 (It should use arc4random_uniform instead, but the modulo bias is an existing issue and there’s no obvious connection between this and key material to me, so I wouldn’t gate this change on availability of the arc4random_uniform(9) KPI.)
sys/net/altq/altq_classq.h
155	Orthogonal change.
164	I don’t know to what extent unpredictability matters in this path or the real world performance impact of arc4random vs random. This should be reviewed by someone more familiar with altq, and use arc4random_uniform(9).
sys/netinet/cc/cc_cdg.c
503 ↗	(On Diff #65982)	I am hesitant to make type changes like this. They’re unrelated to using arc4random.
515 ↗	(On Diff #65982)	This is now a completely different expression in a way that has nothing to do with s/random/arc4random/. Ditto earlier concerns about not being an expert in this code; I don’t know how significant unpredictability is in this context.
sys/netinet/cc/cc_chd.c
88 ↗	(On Diff #65982)	Isn’t there some header this code can include for the real RANDOM_MAX? I don’t like defining well known constants like this in arbitrary C files.
161 ↗	(On Diff #65982)	Ditto cc_cdg remarks.
sys/netinet/cc/cc_hd.c
83 ↗	(On Diff #65982)	Ditto earlier
122 ↗	(On Diff #65982)	Ditto
sys/netpfil/ipfw/ip_dn_io.c
408 ↗	(On Diff #65982)	I don’t believe unpredictability matters here.
514 ↗	(On Diff #65982)	This change makes drop half as likely. Also I doubt this is entropy sensitive. Maybe some ipfw expert can chime in
586 ↗	(On Diff #65982)	I have no idea if this change is appropriate.
sys/vm/memguard.c
450	I don’t believe there’s much reason to use arc4random here.

Hmm ...
When arc4random was introduced in OpenBSD the idea was to have one unique random algorithm to replace them all. This patch is old and reflects that idea. I am now thinking , that that is not the real objective for us.

I have separated the interesting, networking-related, changes in different reviews. Most of the remaining changes are just replacing random in conjunction with modulus operations to limit the range, and are not of high value. I'll probably close the review.

sys/kern/kern_fail.c
170	Probablity should never be negative. In this case 100% probablility is 1 million.
sys/net/altq/altq_classq.h
155	Not completely orthogonal: n will hold an unsigned value (below), which could be higher than MAX_INT if it were not for the modulus operation. i is an index up to n, so it should have the same type..
sys/netinet/cc/cc_cdg.c
515 ↗	(On Diff #65982)	The real change happens in the comparison in the next line. Yes, the change is tricky and has been moved to its own differential: D22924
sys/netinet/cc/cc_chd.c
88 ↗	(On Diff #65982)	This is defined near to where it is used, and while it should be using RAND_MAX for random(), I see why any such value depends on the specific randomness function used. It probably should be in a single header if the code author had set on use it, with the corresponding function, consistently all over the code .
161 ↗	(On Diff #65982)	Yeah... again this on its own differential now.
sys/netpfil/ipfw/ip_dn_io.c
514 ↗	(On Diff #65982)	I indeed moved it to its own differential: D22925
sys/vm/memguard.c
450	This is the case with basically all the above changes ;).

In D6442#502200, @pfg wrote:

Hmm ...
When arc4random was introduced in OpenBSD the idea was to have one unique random algorithm to replace them all. This patch is old and reflects that idea. I am now thinking , that that is not the real objective for us.

Well, the idea is not purely unifying the generators, but to use the same CSPRNG in as much as possible places in kernel to make it harder to predicate the next output, by using these arc4random() calls as another form of randomness introduced to the entropy source (as such, to effectively use the feature the calls has to obtain small quantities of outputs from arc4random()).

That's said, I believe we still want to use arc4random() for, at least, the code paths that are not performance critical and do performance analysis and decide carefully if they should be used in the critical ones.

Drop stuff that was moved to other differentials.

Harbormaster completed remote builds in B28321: Diff 65986.Dec 25 2019, 8:36 PM

Drop revision: none of this is security critical and replacing good old random for the sake of removing it is not an objective.

In D6442#502210, @delphij wrote:

In D6442#502200, @pfg wrote:

Hmm ...
When arc4random was introduced in OpenBSD the idea was to have one unique random algorithm to replace them all. This patch is old and reflects that idea. I am now thinking , that that is not the real objective for us.

Well, the idea is not purely unifying the generators, but to use the same CSPRNG in as much as possible places in kernel to make it harder to predicate the next output, by using these arc4random() calls as another form of randomness introduced to the entropy source (as such, to effectively use the feature the calls has to obtain small quantities of outputs from arc4random()).

That's said, I believe we still want to use arc4random() for, at least, the code paths that are not performance critical and do performance analysis and decide carefully if they should be used in the critical ones.

Theo de Raadt was at some point argumenting that arc4random was extremely fast: he claimed the same perfomance as memset ( but I wouldn't trust such claim). In all the cases above, the changes have no performance implications simply because we use random numbers very rarely and punctually so I doubt we would notice any difference. Ultimately, and speaking of the above cases, performance doesn't matter.

What disappoints me, and made me drop the change, is that the quality of randomness in the above cases is of very little relevance. The great advantage of arc4random(), as I see it, is doubling the range of randomness When you use the modulus operation to restrict the range you lose such "randomness granularity" and there is no advantage in using arc4random. Furthermore, the value itself has no consequence security-wise. Having a random:_uniform function is also of little value in these cases, we should probably not waste out time adding such interface.

Make more use of arc4random() in the kernel.
AbandonedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 65986

sys/geom/multipath/g_multipath.c

sys/kern/kern_fail.c

sys/kern/kern_synch.c

sys/net/altq/altq_classq.h

sys/vm/memguard.c

Make more use of arc4random() in the kernel.AbandonedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 65986

sys/geom/multipath/g_multipath.c

sys/kern/kern_fail.c

sys/kern/kern_synch.c

sys/net/altq/altq_classq.h

sys/vm/memguard.c

Make more use of arc4random() in the kernel.
AbandonedPublic
Actions

Revision Contents
Changeset List