Page MenuHomeFreeBSD

rms-protected event handlers
Changes PlannedPublic

Authored by mjg on Tue, Nov 17, 3:40 PM.

Details

Reviewers
kib
markj
Summary

The current eventhandler code does not scale whatsoever, relocking the list for each callback. This shows up on lockprofile as process_exec, thread_dtor/ctor and other locks. This also top of the profile on a kernel with other changes when creating/destroying threads in a loop. Naive conversion to just rms locks does not work as machinery supports ideas like unregistering a callback by itself, meaning you can't hold rms across the entire thing.

Because of that I implement a new API which scales, is faster single-threaded and less taxing on caches. It retains the "priority" argument as apparently callers need it. One potential regression is that a handler which blocks indefinitely will also block modifications to the list, I don't think that's a problem worth addressing for the foreseable future. It can be done with more complexity.
The goal would be to replace all eventhandlers which are known to be there at compilation time with the new API. This will be separate.

The patch, modulo bugs to be shaken out and perhaps some cleanups is what I intend to go forward with.

thread handlers are converted here https://people.freebsd.org/~mjg/thread-eventhandler.diff

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

mjg requested review of this revision.Tue, Nov 17, 3:40 PM
sys/sys/_eventhandler.h
80

I don't know what to do with this mess. Moving rmslock to a separate file would only temporarily take care of the problem. For example addition of bitmaps would pull both param.h and _cpuset.h

mjg retitled this revision from [semi-wip] rms-protected event handlers to rms-protected event handlers.
mjg edited the summary of this revision. (Show Details)
mjg removed a reviewer: gnn.
  • some cleanups
  • split actual users away

So these 'static' handlers cannot sleep, am I right ? For instance, exec event cannot be converted to the new KPI.
Also, with this addition, we get approximately 2.5 events KPI in kernel.

That said, some actions done through event handlers seems to be excessive. I do not see a reason why e.g. umtx exec hook needs to be event hook and not a direct call from exec, umtx cannot be compiled as a module.

Perhaps a review of the current uses of eventhandler infrastructure in kernel before further work there would benefit the state of the code.

sys/kern/subr_eventhandler.c
399

Why the tautological cast ?

426

Should this be under invariants ?

In D27252#609719, @kib wrote:

So these 'static' handlers cannot sleep, am I right ? For instance, exec event cannot be converted to the new KPI.

They can sleep. The difference compared to the current API is that adding or removing new callbacks wont be possible during that time. This can be fixed but I don't think it is worth the effort.

Also, with this addition, we get approximately 2.5 events KPI in kernel.

The EVENTHANDLER API should be removed, it's just a lot of work.

> That said, some actions done through event handlers seems to be excessive. I do not see a reason why e.g. umtx exec hook needs to be event hook and not a direct call from exec, umtx cannot be compiled as a module.

Perhaps a review of the current uses of eventhandler infrastructure in kernel before further work there would benefit the state of the code.

I did review them. The above API is bare minimum which facilitates converting the majority without significant rework. Things would be better if "thread_dtor" and similar were instead an object-specific list of callbacks so that the rest does not have to branch on them. However, there are still consumers which want to inspect all threads which get created and for that the proposed API or similar can't be avoided.

One of the design bugs here is the "priority" thing where some callbacks register as "first" or "last", but there is more than one for the same list. The 'static' keyword concerns the list itself. EVENTHANDLER has lists identified with strings which get strcmped against to find the actual list. And so on.

sys/kern/subr_eventhandler.c
399

missed cleanup.

426

This is executing rarely enough that imo should stay until the dust settles down.

In D27252#609865, @mjg wrote:
In D27252#609719, @kib wrote:

So these 'static' handlers cannot sleep, am I right ? For instance, exec event cannot be converted to the new KPI.

They can sleep. The difference compared to the current API is that adding or removing new callbacks wont be possible during that time. This can be fixed but I don't think it is worth the effort.

These are rms. Please update locking(9).

Also, with this addition, we get approximately 2.5 events KPI in kernel.

The EVENTHANDLER API should be removed, it's just a lot of work.

> That said, some actions done through event handlers seems to be excessive. I do not see a reason why e.g. umtx exec hook needs to be event hook and not a direct call from exec, umtx cannot be compiled as a module.

Perhaps a review of the current uses of eventhandler infrastructure in kernel before further work there would benefit the state of the code.

I did review them. The above API is bare minimum which facilitates converting the majority without significant rework.

My point is that we should not introduce double of existing core KPI without firm plan to leave only one in very short time frame.

Things would be better if "thread_dtor" and similar were instead an object-specific list of callbacks so that the rest does not have to branch on them. However, there are still consumers which want to inspect all threads which get created and for that the proposed API or similar can't be avoided.

One of the design bugs here is the "priority" thing where some callbacks register as "first" or "last", but there is more than one for the same list.

You mean that the same callback is registered more than once with different priorities ? Can you point out examples ? It should be easy to work around in consumers.

The 'static' keyword concerns the list itself. EVENTHANDLER has lists identified with strings which get strcmped against to find the actual list. And so on.

That was about DIRECT (0.5 of the KPI in the reference above) ?

In D27252#609954, @kib wrote:
In D27252#609865, @mjg wrote:
In D27252#609719, @kib wrote:

So these 'static' handlers cannot sleep, am I right ? For instance, exec event cannot be converted to the new KPI.

They can sleep. The difference compared to the current API is that adding or removing new callbacks wont be possible during that time. This can be fixed but I don't think it is worth the effort.

These are rms. Please update locking(9).

Also, with this addition, we get approximately 2.5 events KPI in kernel.

The EVENTHANDLER API should be removed, it's just a lot of work.

> That said, some actions done through event handlers seems to be excessive. I do not see a reason why e.g. umtx exec hook needs to be event hook and not a direct call from exec, umtx cannot be compiled as a module.

Perhaps a review of the current uses of eventhandler infrastructure in kernel before further work there would benefit the state of the code.

I did review them. The above API is bare minimum which facilitates converting the majority without significant rework.

My point is that we should not introduce double of existing core KPI without firm plan to leave only one in very short time frame.

The plan is to remove all eventhandler lists which use DIRECT_INVOKE. What might turn out to be problematic is the current feature where you can register a callback to a list which does not exist -- then a new one gets allocated. If converting consumers of the feature is a problem the old API may need to stay for the foreseeable future.

Or to put differently, almost everyone will move to the new API within weeks, it's just manual work. The rest will likely move as well and if so the current will be removed.

Things would be better if "thread_dtor" and similar were instead an object-specific list of callbacks so that the rest does not have to branch on them. However, there are still consumers which want to inspect all threads which get created and for that the proposed API or similar can't be avoided.

One of the design bugs here is the "priority" thing where some callbacks register as "first" or "last", but there is more than one for the same list.

You mean that the same callback is registered more than once with different priorities ? Can you point out examples ? It should be easy to work around in consumers.

I mean things like this:

geom/mirror/g_mirror.c: g_mirror_post_sync = EVENTHANDLER_REGISTER(shutdown_post_sync,
geom/mirror/g_mirror.c: g_mirror_shutdown_post_sync, mp, SHUTDOWN_PRI_FIRST);

geom/raid/g_raid.c: g_raid_post_sync = EVENTHANDLER_REGISTER(shutdown_post_sync,
geom/raid/g_raid.c: g_raid_shutdown_post_sync, mp, SHUTDOWN_PRI_FIRST);

And so on. I don't think reworking all consumers is worth the effort right now.

That is, in my opinion the best way forward for now is to move the old API out of the way in easiest way possible -- this solves the immediate performance problem and lets fix the above later in whatever manner.

The 'static' keyword concerns the list itself. EVENTHANDLER has lists identified with strings which get strcmped against to find the actual list. And so on.

That was about DIRECT (0.5 of the KPI in the reference above) ?

mjg planned changes to this revision.Tue, Nov 24, 12:20 AM

I'll take a closer look at turning this into a complete replacement for the current code.

sys/kern/subr_eventhandler.c
375

Wrong indentation.

382

The namespace is still "eventhandler" so this should be called eventhandler_init_static() or so.

490

In the description you said that eventhandlers might deregister themselves. Where does it happen in practice?