Page MenuHomeFreeBSD

x86: handle domain with no CPUs usable for intr delivery
ClosedPublic

Authored by emaste on Aug 18 2023, 3:36 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, May 25, 3:21 AM
Unknown Object (File)
Sat, May 25, 3:21 AM
Unknown Object (File)
Sat, May 25, 3:21 AM
Unknown Object (File)
Sat, May 25, 3:21 AM
Unknown Object (File)
Tue, May 7, 7:18 PM
Unknown Object (File)
Mon, May 6, 10:41 PM
Unknown Object (File)
May 1 2024, 8:22 PM
Unknown Object (File)
May 1 2024, 8:22 PM
Subscribers

Details

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

emaste created this revision.
sys/x86/x86/intr_machdep.c
609

This would need a intr_single_domain check as in intr_next_cpu

tweak intr_init_cpus to avoid calling intr_next_cpu if we've set intr_single_domain

Feedback from markj in D41500

sys/x86/x86/intr_machdep.c
599

The commit is unnecessary now, will remove

emaste added a subscriber: cperciva.

Update failure message, from @cperciva

This is ok. A more generic solution perhaps would have been to have a

static int intr_domain[MAXMEMDOM];

And in intr_init_cpus, in the case where you set the intr_single_domain instead do something like:

if (CPU_OVERLAP(&cpuset_domain[i], &intr_cpus) == 0) {
   intr_domain[i] = 0;
   printf(...);
} else
   intr_domain[i] = i;

and then in intr_next_cpu

domain = intr_domain[domain];

where you are currently conditionally setting it to zero.

sys/x86/x86/intr_machdep.c
643

I don't quite see why you need this added condition? You're wanting to still choose a CPU from domain 0 not just any domain, or at least that's what the code implies? This seems to mean that we'll take any CPU from intr_cpus even if it's in another domain, but that doesn't match the notion of interrupts going to a "single" domain.

In practice I imagine the most common case is going to be two domains, with domain 1 having APIC IDs > 255 and thus unusable. I think your proposal is an improvement over D14500 indeed, and especially so if we had a system with 4 domains, 2 of which are usable. But overall I prefer the simplicity of this approach when systems where this change has any effect do not work at all today, and for code which will be OBE once we have an AMD IOMMU.

Maybe intr_no_domain is a better name for the flag. My intent is that if not all domains work for interrupts we just fall back to the pre- a48de40bcc09cd70632b72b486143a57ed25795b approach.

If we have only two domains then the intr_domain[i] = 0; approach is equivalent; if we have say 4 domains with 2 with APIC IDs < 256 then we probably want to map each non-working one to a unique working one.

Excerpts from boot log with this and a few other WIP and debugging patches:

intr_init_cpus: unable to route interrupts to CPUs in domain 1

nexus_bind_intr irq=0xfffff8013a7f8c80 isrc=0xfffff801190fec00 cpu=259
msi_map: unsupported destination APIC ID 259
nvme1: remap irq 386 to APIC ID 259 failed (error 22)
nvme1: bus_bind_intr to CPU 259 failed (22)
nexus_bind_intr irq=0xfffff8c076612980 isrc=0xfffff801190feb80 cpu=263
msi_map: unsupported destination APIC ID 263 
nvme1: remap irq 387 to APIC ID 263 failed (error 22)
nvme1: bus_bind_intr to CPU 263 failed (22)
nexus_bind_intr irq=0xfffff8c076689a80 isrc=0xfffff801190feb00 cpu=267
msi_map: unsupported destination APIC ID 267
nvme1: remap irq 388 to APIC ID 267 failed (error 22)
nvme1: bus_bind_intr to CPU 267 failed (22)
...

And smoke test to access the nvme devices from each CPU looks good.

#!/bin/sh

for cpu in $(seq 0 511); do
        for dev in nvd0 nvd1; do
                echo cpu $cpu dev $dev
                cpuset -l $cpu dd if=/dev/zero of=/dev/$dev bs=1M count=1
        done
done

Rename global to intr_no_domain to make intent clear

The new name is better. It might be nice to have some sign-posting here via an XXX comment that this can hopefully be reverted once we have AMD IOMMU support.

This revision is now accepted and ready to land.Aug 21 2023, 7:26 PM

It might be nice to have some sign-posting here via an XXX comment that this can hopefully be reverted once we have AMD IOMMU support.

Yeah, good idea. I'll add a comment, perhaps where intr_no_domain is declared explaining the reasoning.