intrng: Shuffle unhandled interrupts too
ClosedPublic
Actions

Authored by cperciva on Sat, Feb 14, 12:46 AM.

Details

Reviewers

andrew
mhorne

Commits

rGb2ba4131b9b0: intrng: Shuffle unhandled interrupts too

Summary

When interrupt vectors are first allocated, they get assigned to
CPU #0; at SI_SUB_SMP / SI_ORDER_SECOND (aka once we have multiple
CPUs), the intr_irq_shuffle SYSINIT clears their CPU sets with the
effect of forcing them to be assigned to new CPUs later.

In case where interrupt vectors were allocated *but not yet bound*
this code did not run, with the effect that those interrupts would
remain pinned to CPU #0 forever. This affected the ena(4) driver,
which allocates interrupts for I/O when the device is attached but
doesn't set them up until the interface is brought up much later in
the boot process (and, crucially, long after intr_irq_shuffle runs).

Adjust intr_irq_shuffle to clear the CPU set for an interrupt source
even if it currently has no handlers, so that it will be properly
assigned to a CPU when it is used later.

MFC after: 1 month
Sponsored by: Amazon

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 70736
Build 67619: arc lint + arc unit

Event Timeline

cperciva created this revision.Sat, Feb 14, 12:46 AM

Herald added subscribers: olce, imp. · View Herald TranscriptSat, Feb 14, 12:46 AM

cperciva requested review of this revision.Sat, Feb 14, 12:46 AM

Harbormaster completed remote builds in B70672: Diff 171896.Sat, Feb 14, 12:46 AM

I have tested in EC2 that this patch results in ena_handle_msix being called on 8 different CPUs (on this hardware we have 8 I/O queues) instead of only being called on CPU 0.

I put down a one-month MFC since this seems entirely MFCable from an interface-stability perspective but interrupt routing is the sort of thing which might take a while for problems to show up. But I'm hoping to land this in 15.1 and maybe also 14.5 (I'm not sure how much this code diverged between 14 and 15 so I'd be very happy to get feedback from experts about whether it makes sense to MFC this all the way to stable/14).

Looks good. The main thing is that we don't throw off the balance by including interrupt sources which exist but will definitely remain unused (the isrc == NULL case). Here the allocation is a good indicator that a driver is attached and it will/can be used.

I guess the alternative/redundant way to resolve this is for intr_setup_irq() to clear isrc->isrc_cpu when isrc->isrc_handlers == 0. This seems equally correct to me. Maybe an assertion added to that path would be revealing.

I don't think this code has diverged much in the lifetime since 14, and it will likely be safe for MFC.

This revision is now accepted and ready to land.Mon, Feb 16, 4:05 PM

In D55284#1264897, @mhorne wrote:

Looks good. The main thing is that we don't throw off the balance by including interrupt sources which exist but will definitely remain unused (the isrc == NULL case). Here the allocation is a good indicator that a driver is attached and it will/can be used.

Ignore this part; what you've proposed doesn't affect the distribution at all.

In D55284#1264897, @mhorne wrote:

I guess the alternative/redundant way to resolve this is for intr_setup_irq() to clear isrc->isrc_cpu when isrc->isrc_handlers == 0. This seems equally correct to me. Maybe an assertion added to that path would be revealing.

I wondered about that, but it's not quite the same -- one could imagine a driver calling bus_teardown_intr when an interface goes down. (ENA is oddly asymmetric here: It sets up I/O interrupts the first time the interface goes up, but doesn't tear them down until the device is going away.)

I opted for doing this in the shuffle code because this is a one-time "clear the temporary CPU assignment from pre-SMP boot" thing and if an interrupt gets repeatedly torn down and re-setup there's no need to do this again.

I don't think this code has diverged much in the lifetime since 14, and it will likely be safe for MFC.

Good to know, thanks! I'll aim to MFC this in time for 14.5 then.

Do we want to zero the CPU for PPIs & IPIs? I expect that check should be before the handlers == 0 check as they may be sent to all CPUs.

In D55284#1265323, @andrew wrote:

Do we want to zero the CPU for PPIs & IPIs? I expect that check should be before the handlers == 0 check as they may be sent to all CPUs.

Oh, I assumed that PPIs and IPIs always had handlers. But I guess even in that case the patch is cleaner if I move that check earlier. Update coming soon.