libc/resolv: get rid of MD5
Closed, Public

Authored by fuz on Mon, Sep 29, 5:53 PM.

Details

Summary

MD5 is used by libc/resolv to generate a random sequence id from a
current time stamp. Replace this convoluted mechanism with a call
to arc4random(). This permits us to entirely drop MD5 from libc,
simplifying the MD5 rework proposed in D45670.
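
As a rough illustration of the replacement described above (the actual diff is not reproduced on this page), a 16-bit transaction ID can be derived from 32 random bits by keeping the low half. The helper name `txid_from_u32` is hypothetical and is not taken from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not from the FreeBSD tree: reduce 32 random
 * bits to a 16-bit DNS transaction ID by keeping the low 16 bits.
 */
static uint16_t
txid_from_u32(uint32_t r)
{
    return (uint16_t)(r & 0xffffu);
}
```

On FreeBSD the argument would typically come from `arc4random()`, declared in `<stdlib.h>`, for example `txid_from_u32(arc4random())`.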

Discussed with @markj at EuroBSDcon 2025.

See also: D45670
Event: EuroBSDcon 2025

Test Plan

passes unit tests, installation in jail uneventful

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

fuz requested review of this revision. (Mon, Sep 29, 5:53 PM)
fuz created this revision.

One open question: this change drops the symbol __res_rndinit. I was unable to find any external user of this symbol, and it does not seem to be meant for external use, but removing it is technically an ABI break. If desired, I can stub it out.

kevans added inline comments.
lib/libc/resolv/res_init.c
739

You do still need to leave a stub and sym_compat (iirc) it to res_rndinit@FBSD_1.4.

  • libc/resolv: re-add __res_rndinit as a compat symbol

After looking at this some more, I suspect that using arc4random() directly here is not necessarily a good idea. Aside from wanting random TXIDs, we also want to avoid repeating them for as long as possible, so really this calls for an LCG periodic over 16 bits (which would require care to implement reseeding upon fork()) or a custom hash of the time of day. Though, it's hard to argue that the current scheme is very careful about TXID reuse...
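
To make the "LCG periodic over 16 bits" idea concrete, here is a minimal sketch. The function name and the constants are illustrative assumptions, not something proposed in this review. The constants satisfy the Hull-Dobell conditions for a power-of-two modulus (odd increment, multiplier congruent to 1 mod 4), so every 16-bit value appears exactly once before the sequence repeats. Reseeding on fork() is deliberately left out, and a bare LCG is predictable, so this shows only the non-repetition property.

```c
#include <stdint.h>

/*
 * Hypothetical full-period LCG modulo 2^16.
 * Multiplier 25173 is 1 mod 4 and increment 13849 is odd, so the
 * sequence visits all 65536 values before returning to its start.
 */
static uint16_t
txid_lcg_next(uint16_t x)
{
    /* The product fits in 32 bits: 25173 * 65535 < 2^32. */
    return (uint16_t)((uint32_t)25173u * x + 13849u);
}
```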

lib/libc/resolv/res_init.c
732

arc4random_uniform() would be better here (though it doesn't really matter when the bound is a power of 2), but see my other comment.
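
The parenthetical can be checked with a little arithmetic. A common way to implement a uniform bounded generator is rejection sampling: raw 32-bit values below `2^32 mod bound` are discarded so that the final modulo is unbiased. The sketch below computes that rejection threshold. The function name is hypothetical, and this is an assumption about the general technique rather than a copy of any libc source. For a power-of-two bound such as 65536 the threshold is 0, so nothing is rejected and plain masking is already uniform.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of low 32-bit values a rejection
 * sampler must discard for an unbiased result modulo `bound`.
 * Equals 2^32 mod bound; bound must be nonzero.
 */
static uint32_t
uniform_reject_threshold(uint32_t bound)
{
    /* (0u - bound) is 2^32 - bound in unsigned arithmetic. */
    return (0u - bound) % bound;
}
```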

> After looking at this some more, I suspect that using arc4random() directly here is not necessarily a good idea. Aside from wanting random TXIDs, we also want to avoid repeating them for as long as possible, so really this calls for an LCG periodic over 16 bits (which would require care to implement reseeding upon fork()) or a custom hash of the time of day. Though, it's hard to argue that the current scheme is very careful about TXID reuse...

Let's separate the discussion of “remove MD5 from libc” and “can we make transaction ID generation better.” This changeset at least doesn't make it any worse than it currently is. As for forks, I don't think the resolver code can cope with these at all, as both forked processes may end up fighting over one UDP socket and it's random which of the two gets the DNS response. Disambiguation by transaction ID doesn't seem to help in this case. For other cases, the disambiguation by client source port should already be sufficient (doesn't the kernel take care to not reuse the same source port too quickly?)

lib/libc/resolv/res_init.c
732

I didn't use arc4random_uniform() as it doesn't add anything here, except forcing a reference to that function into network applications that link libc statically.

> Let's separate the discussion of “remove MD5 from libc” and “can we make transaction ID generation better.”

Yes, ok, I agree that this change is fine for your purposes, I just went down a rabbit hole when reading about this problem.

> As for forks, I don't think the resolver code can cope with these at all, as both forked processes may end up fighting over one UDP socket and it's random which of the two gets the DNS response.

I just meant that with an ad-hoc PRNG we want to make sure that a forked child is careful to reseed to avoid a situation where both processes generate the same sequence of IDs.
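
One way to picture the reseed-on-fork concern is a generator that remembers which process seeded it and reseeds when it finds itself running under a different pid. Everything below is a hypothetical sketch: the struct, the function names, and the LCG constants are assumptions for illustration and are not part of the resolver. The pid and the fresh seed are passed in as parameters so the logic can be exercised without actually forking.

```c
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical generator state; not taken from the FreeBSD resolver. */
struct txid_gen {
    pid_t    owner;  /* process that last seeded the state */
    uint16_t state;  /* current position in the ID sequence */
};

/* Seed the generator and record which process did so. */
static void
txid_gen_seed(struct txid_gen *g, uint16_t seed, pid_t self)
{
    g->owner = self;
    g->state = seed;
}

/*
 * Return the next ID. If `self` differs from the pid that seeded the
 * state (as it would in a forked child), reseed from `fresh_seed`
 * first so parent and child do not emit the same sequence.
 */
static uint16_t
txid_gen_next(struct txid_gen *g, pid_t self, uint16_t fresh_seed)
{
    if (g->owner != self)
        txid_gen_seed(g, fresh_seed, self);
    g->state = (uint16_t)((uint32_t)25173u * g->state + 13849u);
    return g->state;
}
```

In real use the caller would pass `getpid()` for `self` and a value from a strong source such as `arc4random()` for `fresh_seed`. Registering a handler with `pthread_atfork()` is another possible approach.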

> For other cases, the disambiguation by client source port should already be sufficient (doesn't the kernel take care to not reuse the same source port too quickly?)

Hmm, not that I can see. in_pcb_lport_dest() records the last allocated port number in the pcbinfo, but nothing reads that value, and the function uses arc4random() to pick a starting point for the search. For TCP sockets I expect the kernel will not recycle the port number too quickly because of the TIME-WAIT state, but for UDP I believe we have no such mechanism.

lib/libc/resolv/res_init.c
732
This revision is now accepted and ready to land. (Fri, Oct 3, 10:43 PM)
This revision was automatically updated to reflect the committed changes.