The patch uses the VT-d interrupt remapping (IR) block to translate FSB messages. One consequence is that, although I/O APICs accept only an 8-bit APIC ID, the IR translation structures accept a 32-bit APIC ID, which allows x2APIC mode to function properly.
For each MSI and I/O APIC interrupt source, an IOMMU cookie is added, which is in fact the index of the IRE (interrupt remap entry) in the IR table. The cookie is allocated when the source is allocated, and is then used at map time to fill both the IRE and the device registers. The MSI address/data registers and the I/O APIC redirection registers are programmed with special values which the IR recognizes and uses to recover the IRE index, and from it the proper delivery mode and target. I have to map all MSI interrupts in the block when msi_map() is called.
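As an illustration of the "special values" mentioned above, here is a minimal sketch of packing an IRE index into a remappable-format MSI address and recovering it again. The bit layout follows my reading of the VT-d specification; the function and macro names are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Remappable-format MSI address fields (layout assumed from the VT-d spec). */
#define	MSI_ADDR_BASE		0xfee00000u	/* bits 31:20, interrupt address range */
#define	MSI_ADDR_IF_REMAP	(1u << 4)	/* interrupt format: 1 = remappable */
#define	MSI_ADDR_SHV		(1u << 3)	/* subhandle valid */

/* Hypothetical helper: build the MSI address for a 16-bit IRE index. */
static uint32_t
ir_msi_addr(uint32_t ire_index, int shv)
{
	uint32_t addr;

	assert(ire_index <= 0xffff);
	addr = MSI_ADDR_BASE | MSI_ADDR_IF_REMAP;
	addr |= (ire_index & 0x7fff) << 5;	/* index[14:0] -> bits 19:5 */
	addr |= ((ire_index >> 15) & 1) << 2;	/* index[15]   -> bit 2 */
	if (shv)
		addr |= MSI_ADDR_SHV;
	return (addr);
}

/* Hypothetical helper: recover the IRE index the way the IR hardware would. */
static uint32_t
ir_msi_addr_to_index(uint32_t addr)
{
	return (((addr >> 5) & 0x7fff) | (((addr >> 2) & 1) << 15));
}
```

With the SHV bit set, the hardware adds the subhandle from the MSI data register to this index, which would be consistent with a multi-vector MSI block occupying consecutive IREs that are all mapped together.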
The KPI of the IR is isolated in x86/iommu/iommu_intrmap.h, to avoid pulling all DMAR headers into the interrupt code. Non-PCI(e) devices which generate message interrupts on the FSB require special handling. The HPET FSB interrupts are remapped, while DMAR interrupts are not. I need to detect the device class of these exceptional devices to handle originators and mapping correctly, which is why the hpet and dmar device classes become global variables.
The problem with the patch is that interrupt source setup and teardown are done in a non-sleepable context. Sometimes the context even disallows blocking (why is icu_lock a spinlock?). Even if I manage to ensure that no memory allocation failure happens, I still need to flush the interrupt entry cache in the IR hardware, which is done asynchronously and ideally would wait for the completion interrupt. Instead, I busy-wait for the queue to drain.
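The busy-wait described above could look roughly like the following sketch. It assumes the hardware signals completion by writing a known value into a status word in memory; the function name, the status-word convention and the spin bound are all assumptions for illustration, not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: poll a status word until the hardware (assumed to
 * write done_value on completion) reports that the queue has drained.
 * Returns true on completion, false if max_spins polls elapsed first.
 */
static bool
ir_wait_drained(volatile const uint32_t *status, uint32_t done_value,
    uint64_t max_spins)
{
	uint64_t i;

	for (i = 0; i < max_spins; i++) {
		if (*status == done_value)
			return (true);
		/* A kernel would place a CPU relax hint such as cpu_spinwait() here. */
	}
	/* One final check so that a completion racing with the bound is not missed. */
	return (*status == done_value);
}
```

The bound is there only to keep the sketch from spinning forever; whether a real implementation can tolerate a timeout while holding a spinlock is a separate design question.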
Another issue I see is that intr_shuffle_irqs() is non-atomic. When bus_remap_intr() is called, the devices reload their MSI/MSI-X(-like) config registers, and must disable interrupts around the writes, since more than one register is written. This could be eliminated, at least on amd64, since remapping effectively only changes the IR table; the MSI config is constant unless the IOMMU interrupt cookie changes. We could eliminate the calls to bus_remap_intr() altogether, but this is currently not done.