pci: Don't re-route legacy PCI on MSI/MSIX devs
Needs Review, Public

Authored by cperciva on Apr 12 2025, 12:46 AM.
Referenced Files
F133546876: D49792.id153558.diff
Sun, Oct 26, 1:55 PM

Details

Summary

There is a (very historical) call to pci_assign_interrupt for the
purpose of routing PCI IRQs. On INTRNG systems this can cause a
(synthetic) IRQ leak and ultimately a kernel panic after many
hotplug/unplug cycles.

Since this should only be necessary on legacy PCI, disable it for
devices which use MSI/MSIX.

MFC after: 2 weeks
Sponsored by: Amazon
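
For readers without the diff open, the condition described in the summary can be modelled in user space. This is a minimal sketch, not the actual change to sys/dev/pci/pci.c: the struct and function names (model_pcicfg, model_should_route_intx) are hypothetical, and the sketch assumes that a non-zero capability offset marks an MSI or MSI-X capable device and that an interrupt line of 255 means "no IRQ assigned".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's per-device config. */
struct model_pcicfg {
	uint8_t intpin;        /* 0 means the function has no INTx pin */
	uint8_t intline;       /* assumed: 255 means "no IRQ assigned" */
	uint8_t msi_location;  /* offset of the MSI capability, 0 if absent */
	uint8_t msix_location; /* offset of the MSI-X capability, 0 if absent */
};

/* True when the device advertises an MSI or an MSI-X capability. */
static bool
model_has_msi_or_msix(const struct model_pcicfg *cfg)
{
	assert(cfg != NULL);
	return (cfg->msi_location != 0 || cfg->msix_location != 0);
}

/*
 * Decide whether the legacy INTx routing step runs at attach time.
 * This models the idea in the summary, not the real diff.
 */
static bool
model_should_route_intx(const struct model_pcicfg *cfg)
{
	if (cfg->intpin == 0 || cfg->intline == 255)
		return (false);	/* nothing valid to route */
	if (model_has_msi_or_msix(cfg))
		return (false);	/* proposed change: skip MSI/MSI-X devices */
	return (true);
}
```

Under these assumptions a plain legacy device is still routed, while any device that merely advertises MSI or MSI-X is skipped, whether or not its driver ends up using message interrupts.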

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 63472
Build 60356: arc lint + arc unit

Event Timeline

@jhb This avoids the IRQ leak on the EC2 instance I was having trouble with before. I think you suggested that this was only necessary for non-MSI/MSIX devices before I went down the (wrong) route of making it x86-only.

@jrtc27 @jhibbits Can you try this patch on your systems and let me know if it causes problems?

sys/dev/pci/pci.c
4178

The problem with this is that the driver might still choose to use INTx (or the admin might have disabled MSI via tunable, etc.) and since cfg->intline isn't invalid, we won't try to check it.

Maybe we need to encode this differently and only call pci_assign_interrupt() the first time pci_alloc_resource() tries to alloc the INTx interrupt. The IRQ in dmesg might still be wrong in this case though.
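
The "route on first INTx allocation" idea above can be sketched as a small user-space model. Everything here is hypothetical (model_dev, model_route_intx, model_alloc_irq, and the placeholder IRQ values); the only convention borrowed from the PCI bus code is that rid 0 denotes the legacy INTx interrupt and non-zero rids denote MSI/MSI-X messages. It is a sketch of the control flow, not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state; none of these names come from pci.c. */
struct model_dev {
	bool intx_routed;	/* has the legacy routing step run yet? */
	int route_calls;	/* how many times routing ran */
	int irq;		/* IRQ currently recorded for INTx */
};

/* Stand-in for the routing work that pci_assign_interrupt() performs. */
static void
model_route_intx(struct model_dev *dev)
{
	dev->route_calls++;
	dev->irq = 42;		/* arbitrary placeholder for a routed IRQ */
	dev->intx_routed = true;
}

/*
 * Stand-in for an IRQ allocation request. Only a request for rid 0
 * (INTx) triggers routing, and only the first time.
 */
static int
model_alloc_irq(struct model_dev *dev, int rid)
{
	assert(dev != NULL);
	if (rid != 0)
		return (1000 + rid);	/* message interrupt: no routing */
	if (!dev->intx_routed)
		model_route_intx(dev);
	return (dev->irq);
}
```

In this model a driver that only ever allocates MSI or MSI-X vectors never triggers routing, so nothing is allocated on its behalf, while a driver that falls back to INTx gets routed exactly once. The sketch does not address the dmesg concern raised above: anything printed before the first INTx allocation would still show the unrouted value.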

cperciva added inline comments.
sys/dev/pci/pci.c
4178

Oh, I assumed that systems were either MSI/MSIX or legacy... I didn't realize they might be both. Earlier commits in this section of code pointed the same way: e.g. when @bz added __PCI_REROUTE_INTERRUPT to arm64 in 3d9ac4ecfcba638fbbc9ec7925fea08206198deb, he noted that this "was not needed on real hardware (yet) as it was always MSI".

At this point I'm wondering if I should just hide this behind a loader tunable, "hw.pci.no_legacy_reroute", since that way I can fix the panic on EC2 without potentially breaking weird systems.
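
A rough user-space model of how such a tunable could gate the re-route, assuming it is an integer that defaults to 0 so that existing behaviour is unchanged unless an admin opts in. The names model_tunables, model_fetch_tunable and model_should_reroute are hypothetical, and the string argument merely stands in for whatever value the loader environment supplies; the real wiring would use the kernel's tunable machinery rather than strtol().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical holder for the proposed hw.pci.no_legacy_reroute value. */
struct model_tunables {
	int no_legacy_reroute;	/* 0 (default): keep re-routing */
};

/*
 * Parse a value as it might appear in the loader environment.
 * A missing or non-numeric value leaves the default untouched.
 */
static void
model_fetch_tunable(struct model_tunables *t, const char *value)
{
	char *end;
	long v;

	assert(t != NULL);
	if (value == NULL)
		return;
	v = strtol(value, &end, 0);
	if (end == value)
		return;		/* no digits: ignore */
	t->no_legacy_reroute = (v != 0);
}

/* The legacy re-route runs only while the tunable is left at 0. */
static bool
model_should_reroute(const struct model_tunables *t)
{
	assert(t != NULL);
	return (t->no_legacy_reroute == 0);
}
```

The point of the default is that systems which depend on the re-route see no change, while an affected EC2 instance can set the tunable to 1 at boot.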

sys/dev/pci/pci.c
4178

It's not even a hardwired thing. Drivers can choose which type of interrupt to use at runtime. Legacy PCI only supported INTx originally, so many/most devices still support it. MSI is mandated by PCI Express, so most modern devices support both. However, some earlier PCI Express devices had bugs and require INTx instead of MSI. Generally speaking, newer devices should be using MSI since it is more flexible and more performant, but INTx is still around.
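
To illustrate the runtime choice described above, here is a small model of the usual preference order (MSI-X, then MSI, then INTx). In a real FreeBSD driver this is typically done by attempting pci_alloc_msix() and pci_alloc_msi() and falling back to the rid 0 IRQ resource; the enum, struct and function below are hypothetical stand-ins, and the two "disabled" flags are assumptions meant to represent an admin setting and a driver quirk for buggy hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_intr_type {
	MODEL_INTR_MSIX,
	MODEL_INTR_MSI,
	MODEL_INTR_INTX,
	MODEL_INTR_NONE
};

/* Hypothetical description of a device and its environment. */
struct model_device {
	bool has_msix;		/* advertises an MSI-X capability */
	bool has_msi;		/* advertises an MSI capability */
	bool has_intx;		/* has a legacy interrupt pin */
	bool msi_disabled;	/* admin turned message interrupts off */
	bool msi_quirk;		/* driver knows MSI is broken on this part */
};

/* Pick the interrupt type a driver would end up using. */
static enum model_intr_type
model_pick_interrupt(const struct model_device *d)
{
	bool msg_ok;

	assert(d != NULL);
	msg_ok = !d->msi_disabled && !d->msi_quirk;
	if (msg_ok && d->has_msix)
		return (MODEL_INTR_MSIX);
	if (msg_ok && d->has_msi)
		return (MODEL_INTR_MSI);
	if (d->has_intx)
		return (MODEL_INTR_INTX);
	return (MODEL_INTR_NONE);
}
```

The case that matters for this review is a device that advertises MSI or MSI-X but still ends up on INTx: a check made at attach time on capability presence alone cannot predict that outcome.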

sys/dev/pci/pci.c
4178

Got it. You ok with my tunable workaround? https://reviews.freebsd.org/D49849