Preamble
This change matters only for shared interrupts, and I realize that those
are quickly becoming a thing of the past. I also understand that the
problem that I am trying to solve is extremely rare. Nevertheless, I
would like to make the code a bit more complex but also a little safer.
The change
This is a fix for https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229106
plus some changes.
The idea of the fix is somewhat similar to epoch-based reclamation.
Atomic operations and memory fences are used to ensure that the
ie_handlers list is always safe to navigate. A new element is fully set
up before it is linked into the list, and a removed element is not
reclaimed until we know that the list is not being iterated.
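The publish side of that idea can be sketched in userland with C11
atomics standing in for the kernel's atomic(9) operations. The struct
and function names below are illustrative, not from the patch; the real
struct intr_handler has many more fields:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical handler node; stands in for struct intr_handler. */
struct handler {
	int (*func)(void *);
	void *arg;
	struct handler *_Atomic next;
};

static struct handler *_Atomic head = NULL;

/* A trivial filter used for illustration. */
static int
dummy_filter(void *arg)
{
	(void)arg;
	return (42);
}

/*
 * Publish: the node is fully initialized first; the single release
 * store is what makes it reachable, so a lock-less reader can never
 * observe a half-constructed node.
 */
static void
publish(struct handler *h, int (*func)(void *), void *arg)
{
	h->func = func;
	h->arg = arg;
	atomic_store_explicit(&h->next,
	    atomic_load_explicit(&head, memory_order_relaxed),
	    memory_order_relaxed);
	atomic_store_explicit(&head, h, memory_order_release);
}

/*
 * Lock-less reader: the acquire load pairs with the release store
 * above, so it sees either a fully set up node or no node at all.
 */
static int
run_first(void)
{
	struct handler *h = atomic_load_explicit(&head,
	    memory_order_acquire);

	return (h != NULL ? h->func(h->arg) : -1);
}
```

The reclamation half (not shown here) is what the 'running' indicators
described below are for.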
There are some simplifications compared to general epoch-based
reclamation. All writers are serialized by a mutex, so we do not need
to worry about concurrent modifications. All read accesses from the
open context are serialized too. So, there can be only a single
unsynchronized reader at a time: either the code running in the ISR
context or an interrupt thread. Thus, instead of epoch numbers, it is
enough to simply mark the start and end of the ie_handlers iteration.
Some details
I changed ie_handlers from TAILQ to SLIST for two reasons:
- SLIST is easy to keep consistent at all times for lock-less iteration
- SLIST is sufficient for all required operations, given that its modifications are not performance critical
I use the previously unused SLIST_FOREACH_PREVPTR to find the place
where to insert or remove an element. I do not use any SLIST_INSERT or
SLIST_REMOVE variants, but roll out their equivalents as needed to
ensure the atomic properties. That somewhat breaks SLIST encapsulation.
It also reinvents CK_SLIST a little bit. I felt that using CK_SLIST
when most accesses are done under a lock would be an unneeded overhead.
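The rolled-out insert and remove can be sketched in userland with C11
atomics in place of the kernel's atomic(9) operations. The node layout,
the sort-by-priority insert, and all names here are illustrative
assumptions, not code from the patch:

```c
#include <stdatomic.h>
#include <stddef.h>

struct node {
	int pri;
	struct node *_Atomic next;
};

static struct node *_Atomic list_head = NULL;

/*
 * Rolled-out SLIST_INSERT equivalent: find the insertion point with a
 * prev-pointer walk (as SLIST_FOREACH_PREVPTR would), fill in the new
 * node's next pointer first, then make the node visible with a single
 * release store. Writers are assumed to be serialized by a mutex.
 */
static void
insert_sorted(struct node *n)
{
	struct node *_Atomic *prevp = &list_head;
	struct node *cur;

	while ((cur = atomic_load_explicit(prevp,
	    memory_order_relaxed)) != NULL && cur->pri < n->pri)
		prevp = &cur->next;
	atomic_store_explicit(&n->next, cur, memory_order_relaxed);
	atomic_store_explicit(prevp, n, memory_order_release);
}

/*
 * Rolled-out SLIST_REMOVE equivalent: a single atomic store unlinks
 * the node, so a concurrent lock-less reader sees the list either
 * before or after the unlink, never a half-updated state. The node
 * itself must not be reclaimed until no reader can still hold it.
 */
static void
remove_node(struct node *n)
{
	struct node *_Atomic *prevp = &list_head;
	struct node *cur;

	while ((cur = atomic_load_explicit(prevp,
	    memory_order_relaxed)) != n)
		prevp = &cur->next;
	atomic_store_explicit(prevp,
	    atomic_load_explicit(&n->next, memory_order_relaxed),
	    memory_order_relaxed);
}
```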
I actually use two different mechanisms to wait for the completion of
the unsynchronized access. The ISR context atomically increments and
decrements a 'running' indicator, and the removals spin until it
becomes zero. The assumption is that interrupt filters are very fast.
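A userland sketch of that first mechanism, with C11 atomics standing in
for atomic(9) and the function names invented for illustration:

```c
#include <stdatomic.h>

static atomic_int running = 0;

/* ISR side: bracket the filter iteration with an inc/dec pair. */
static void
isr_enter(void)
{
	atomic_fetch_add_explicit(&running, 1, memory_order_acquire);
}

static void
isr_exit(void)
{
	atomic_fetch_sub_explicit(&running, 1, memory_order_release);
}

/*
 * Removal side: after unlinking a handler, spin until no filter
 * iteration is in flight; only then is the node safe to reclaim.
 * The kernel code would use cpu_spinwait() in the loop body.
 */
static void
wait_for_filters(void)
{
	while (atomic_load_explicit(&running, memory_order_acquire) != 0)
		;
}
```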
The ithread code uses a similar but different indicator. That
indicator is set and cleared under a spinlock. The removals also check
the indicator under the spinlock. If it's not zero, then the code
sets a special 'waiting' flag and uses msleep(9) to wait. The ithread
code wakes up the waiter(s) when clearing the running indicator. I did
it this way because I assumed that the ithread processing can take much
longer than the filters, so it is not practical to spin for all that
time. But maybe a regular mutex would serve here better than the sleep
plus wakeup.
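A userland analogue of that second mechanism, with a pthread mutex and
condition variable standing in for the spinlock and msleep(9)/wakeup(9).
All names below are illustrative, not from the patch:

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t it_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t it_cv = PTHREAD_COND_INITIALIZER;
static bool it_running = false;
static bool it_waiting = false;

/* ithread: mark the start of a handler iteration under the lock. */
static void
ithread_begin(void)
{
	pthread_mutex_lock(&it_lock);
	it_running = true;
	pthread_mutex_unlock(&it_lock);
}

/* ithread: clear the indicator and wake up any waiter(s). */
static void
ithread_end(void)
{
	pthread_mutex_lock(&it_lock);
	it_running = false;
	if (it_waiting) {
		it_waiting = false;
		pthread_cond_broadcast(&it_cv);	/* wakeup(9) */
	}
	pthread_mutex_unlock(&it_lock);
}

/*
 * Removal: check the indicator under the lock; if an iteration is in
 * flight, set the 'waiting' flag and sleep instead of spinning, since
 * ithread processing can take much longer than a filter.
 */
static void
removal_wait(void)
{
	pthread_mutex_lock(&it_lock);
	while (it_running) {
		it_waiting = true;
		pthread_cond_wait(&it_cv, &it_lock);	/* msleep(9) */
	}
	pthread_mutex_unlock(&it_lock);
}
```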
I also changed _intr_drain(), which is used by LinuxKPI, to do the same
kind of waiting as the interrupt handler removal does.
I removed the check for 'cold'. I believe that msleep() should work
fine in all contexts where intr_event_remove_handler() could be called.
I made intr_event_execute_handlers() static because it's not used
outside of kern_intr.c. So, I decided to "appropriate" it.
Also, previously intr_event_update() was not called if a filter-only
intr_handler was removed.
A couple of white space fixes.