Page MenuHomeFreeBSD

random(4): Make entropy source deregistration safe
ClosedPublic

Authored by cem on Nov 21 2019, 9:04 PM.

Details

Summary

Allow loadable modules that provide random entropy source(s) to safely
unload. Prior to this change, no driver could ensure that their
random_source structure was not being used by random_harvestq.c for any
period of time after invoking random_source_deregister().

This change converts the source_list LIST to a ConcurrencyKit CK_LIST and
uses an epoch(9) to protect typical read accesses of the list. The existing
HARVEST_LOCK spin mutex is used to safely add and remove list entries.
random_source_deregister() uses epoch_wait() to ensure no concurrent
source_list readers are accessing a random_source before freeing the list
item and returning to the caller.

Callers can safely unload immediately after random_source_deregister()
returns.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

cem created this revision.Nov 21 2019, 9:04 PM
cem added subscribers: emaste, markj, jhb.Nov 21 2019, 9:07 PM

I would like to get more eyeballs from anyone more familiar with CK and epoch(9) to sanity check the idea. If you know of anyone who would be a good reviewer, please let me know or add them directly. Thanks.

sys/dev/random/random_harvestq.c
249 ↗(On Diff #64694)

We should probably turn SI_SUB_TASKQ + 1's myriad copies into an SI_SUB_EPOCH eventually.

markm accepted this revision.Nov 22 2019, 7:57 AM
markm added a subscriber: markm.
markm added inline comments.
sys/dev/random/random_harvestq.c
249 ↗(On Diff #64694)

That SI_SUB_EPOCH is looking really important now.

This revision is now accepted and ready to land.Nov 22 2019, 7:57 AM
markj added inline comments.Nov 22 2019, 3:42 PM
sys/dev/random/random_harvestq.c
299 ↗(On Diff #64694)

This function may acquire an sx lock if RANDOM_LOADABLE is configured, and that's not permitted in an epoch section.

Hmm. We acquire that lock earlier in this function though, so I guess we're recursing.

cem added inline comments.Nov 22 2019, 4:47 PM
sys/dev/random/random_harvestq.c
249 ↗(On Diff #64694)

I’ll chase that up in a separate revision.

299 ↗(On Diff #64694)

Yeah. The recursion should be permissible, if ugly? It may also be possible to convert the (random(4)) loadable module system into an epoch system, I think. One step at a time.

(RANDOM_LOADABLE gets periodically broken for months and we don’t get emails about it. Maybe someone (juniper?) would care if we shipped a release with it broken. But: not hugely popular and I’d love to not have it.)

Does this approach seem sane otherwise?

jhb added inline comments.Nov 22 2019, 5:20 PM
sys/dev/random/random_harvestq.c
299 ↗(On Diff #64694)

I wonder if this epoch can actually replace the sx lock directly? You will still need a separate lock for the exclusive access, but perhaps the exclusive one wouldn't have to be an sx anymore.

On a cursory glance, it looks like this would work. I think you would need an epoch_wait() after clearing the function pointers during unload. I'm not sure you even need an xlock at all. You really just need a barrier between clearing the function pointers on unload and the epoch_wait(), but epoch_wait() probably has a sufficient barrier in its implementation already. This would remove the recursion altogether.

cem added inline comments.Nov 22 2019, 5:43 PM
sys/dev/random/random_harvestq.c
299 ↗(On Diff #64694)

Yeah, that's what I had in mind. Do you think it makes sense to do all of that as part of this revision, or separately?

cem added inline comments.Nov 22 2019, 7:21 PM
sys/dev/random/random_harvestq.c
299 ↗(On Diff #64694)

Hm, in investigating further, I don't believe a non-preemtible epoch can replace the CONFIG sx lock, because the random_reader_context calls it wraps may block. So maybe it makes sense to use a separate, preemptible epoch for that? Either way, I'd like to leave it for a future revision.

jhb added a comment.Nov 22 2019, 7:30 PM

Probably ok to do as a followup. I think though that in the followup what yo would have to do is use a single epoch and just have it be preemptive in the RANDOM_LOADABLE case, maybe you can just do that in this commit as well to appease any assertions that might fire?

cem added a comment.Nov 22 2019, 7:48 PM
In D22489#492132, @jhb wrote:

Probably ok to do as a followup. I think though that in the followup what yo would have to do is use a single epoch and just have it be preemptive in the RANDOM_LOADABLE case,

Hm, I'll give it a shot.

maybe you can just do that in this commit as well to appease any assertions that might fire?

Well, I won't commit this as-is if it trips WITNESS :-).

sys/dev/random/random_harvestq.c
249 ↗(On Diff #64694)
cem added inline comments.Nov 22 2019, 7:50 PM
sys/dev/random/random_harvestq.c
249 ↗(On Diff #64694)

Correction: D22504

cem added a comment.Nov 22 2019, 10:14 PM
The epoch API currently does not support sleeping in epoch_preempt
sections.

Is that bit still true? It seems to be the case:

_epoch_enter_preempt(epoch_t epoch, epoch_tracker_t et EPOCH_FILE_LINE)
{
...
    THREAD_NO_SLEEPING();

So I guess EPOCH_PREEMPT allows you to take ordinary (contested) mutexes and rwlocks and things like that, but not sxlocks or sleep(9). It looks like the sxlock recurse path shouldn't attempt to sleep, so long as there are no exclusive waiters. I guess we can't guarantee that, so there isn't a clear path forward here (for either proposal).

Maybe it would be useful to have a 3rd tier of epoch that allows sleeping, distinct from the preemptible class? This means the wait() and call() delays are potentially unbounded, but that's not a regression for an API already using sleepable locks. Much like we have a hierarchy today of sleepable -> "blockable" -> spin locks, maybe it makes sense to have a similar one in epoch(9)?

cem planned changes to this revision.Nov 23 2019, 12:22 AM

Pending D22512.

cem added a comment.Dec 26 2019, 7:51 PM

D22512 has landed and random_loadable no longer uses SX locks.

My current plan is to make the minor adjustment: SYSINIT(rs_epoch_init, SI_SUB_TASKQ + 1, => SI_SUB_EPOCH.

I am not sure if we need a pre-emptible epoch. random_harvest_direct acquires an ordinary (non-spin) mutex (Fortuna's global lock, which protects concurrent access to the entropy pools as well as the generator). Does that mean the epoch needs to be preemptible? (It is probably not suitable to use a spin mutex for Fortuna's global lock.)

cem updated this revision to Diff 66045.Dec 28 2019, 7:44 AM

Switch to preemptible epoch for algorithm locks (e.g., Fortuna's pool mutex).

markm accepted this revision.Dec 28 2019, 11:31 AM

I can't find anything to object to - but my epoch-fu is a bit lacking, so I'm trusting the other reviewers here.

This revision is now accepted and ready to land.Dec 28 2019, 11:31 AM
cem added a comment.Dec 28 2019, 6:06 PM

Thanks Mark! Epoch(9) is new to me too.

@markj , @jhb , does this revision make sense now? Hopefully the previous concerns about the config SX lock have been resolved externally by rS356096.

Fortuna's global mutex is a MTX_DEF, not a spin mutex, so I believe that means we need a PREEMPT epoch. I might be mistaken.

We can require that any Fortuna-alternative random(4) algorithm implementation (i.e., D22837) use only non-sleeping primitives.

I ran the current patch with EPOCH_TRACE enabled and did not see any warnings related to it. Quite a lot of Net epoch warnings, though.

markj accepted this revision.Dec 30 2019, 12:38 AM
This revision was automatically updated to reflect the committed changes.