Page MenuHomeFreeBSD

intrng: Shuffle unhandled interrupts too
ClosedPublic

Authored by cperciva on Sat, Feb 14, 12:46 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Feb 20, 2:55 PM
Unknown Object (File)
Wed, Feb 18, 2:15 AM
Unknown Object (File)
Mon, Feb 16, 1:02 PM
Unknown Object (File)
Mon, Feb 16, 11:11 AM
Unknown Object (File)
Sun, Feb 15, 11:18 PM
Unknown Object (File)
Sat, Feb 14, 9:02 PM
Unknown Object (File)
Sat, Feb 14, 9:07 AM
Unknown Object (File)
Sat, Feb 14, 8:03 AM
Subscribers

Details

Summary

When interrupt vectors are first allocated, they get assigned to
CPU #0; at SI_SUB_SMP / SI_ORDER_SECOND (aka once we have multiple
CPUs), the intr_irq_shuffle SYSINIT clears their CPU sets with the
effect of forcing them to be assigned to new CPUs later.

In case where interrupt vectors were allocated *but not yet bound*
this code did not run, with the effect that those interrupts would
remain pinned to CPU #0 forever. This affected the ena(4) driver,
which allocates interrupts for I/O when the device is attached but
doesn't set them up until the interface is brought up much later in
the boot process (and, crucially, long after intr_irq_shuffle runs).

Adjust intr_irq_shuffle to clear the CPU set for an interrupt source
even if it currently has no handlers, so that it will be properly
assigned to a CPU when it is used later.

MFC after: 1 month
Sponsored by: Amazon

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 70736
Build 67619: arc lint + arc unit

Event Timeline

I have tested in EC2 that this patch results in ena_handle_msix being called on 8 different CPUs (on this hardware we have 8 I/O queues) instead of only being called on CPU 0.

I put down a one-month MFC since this seems entirely MFCable from an interface-stability perspective but interrupt routing is the sort of thing which might take a while for problems to show up. But I'm hoping to land this in 15.1 and maybe also 14.5 (I'm not sure how much this code diverged between 14 and 15 so I'd be very happy to get feedback from experts about whether it makes sense to MFC this all the way to stable/14).

Looks good. The main thing is that we don't throw off the balance by including interrupt sources which exist but will definitely remain unused (the isrc == NULL case). Here the allocation is a good indicator that a driver is attached and it will/can be used.

I guess the alternative/redundant way to resolve this is for intr_setup_irq() to clear isrc->isrc_cpu when isrc->isrc_handlers == 0. This seems equally correct to me. Maybe an assertion added to that path would be revealing.

I don't think this code has diverged much in the lifetime since 14, and it will likely be safe for MFC.

This revision is now accepted and ready to land.Mon, Feb 16, 4:05 PM

Looks good. The main thing is that we don't throw off the balance by including interrupt sources which exist but will definitely remain unused (the isrc == NULL case). Here the allocation is a good indicator that a driver is attached and it will/can be used.

Ignore this part; what you've proposed doesn't affect the distribution at all.

I guess the alternative/redundant way to resolve this is for intr_setup_irq() to clear isrc->isrc_cpu when isrc->isrc_handlers == 0. This seems equally correct to me. Maybe an assertion added to that path would be revealing.

I wondered about that, but it's not quite the same -- one could imagine a driver calling bus_teardown_intr when an interface goes down. (ENA is oddly asymmetric here: It sets up I/O interrupts the first time the interface goes up, but doesn't tear them down until the device is going away.)

I opted for doing this in the shuffle code because this is a one-time "clear the temporary CPU assignment from pre-SMP boot" thing and if an interrupt gets repeatedly torn down and re-setup there's no need to do this again.

I don't think this code has diverged much in the lifetime since 14, and it will likely be safe for MFC.

Good to know, thanks! I'll aim to MFC this in time for 14.5 then.

Do we want to zero the CPU for PPIs & IPIs? I expect that check should be before the handlers == 0 check as they may be sent to all CPUs.

Do we want to zero the CPU for PPIs & IPIs? I expect that check should be before the handlers == 0 check as they may be sent to all CPUs.

Oh, I assumed that PPIs and IPIs always had handlers. But I guess even in that case the patch is cleaner if I move that check earlier. Update coming soon.

Don't CPU_ZERO for PPI/IPI with no handlers

I don't think this case ever happens, but the change is cleaner
this way anyway.

This revision now requires review to proceed.Tue, Feb 17, 6:54 PM
This revision is now accepted and ready to land.Wed, Feb 18, 12:45 PM
This revision was automatically updated to reflect the committed changes.