There is potential for non-root PICs to need per-processor
initialization. Few root PICs try to propogate the call. As such add
a pass of calling pic_init_secondary() on all children with a non-root
value.
Details
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 55829 Build 52718: arc lint + arc unit
Event Timeline
Check the build and naturally find oversights from the quick implementation. Tiny fix. One item is whether the pic_list_lock should in fact be held...
I'm not the right person to review this. The review description also makes it hard to understand whether the change has any functional impact or not: is this a bug fix, or a simplification for PIC drivers?
In case anyone is wondering, to answer the comment "QQQ: Only root PIC is aware of other CPUs ???". That is indeed the way this was implemented. The more important issue is are PICs besides the root PIC interested in being informed when processors are brought on-line.
In at least one case the answer is a resounding "yes". I'm working with a soft-PIC which connects via a single PPI to a GIC and urgently needs to do setup as processors are brought on-line. I suspect the Hyper-V driver might also want to take advantage of this.
One weakness is due to pic_list being an SLIST, secondary PICs will end up having their functions called first.
It is notable few PIC drivers implement a PIC_INIT_SECONDARY() hook. Specifically sys/arm/arm/gic.c, sys/arm64/arm64/gic_v3.c and sys/arm/broadcom/bcm2835/bcm2836.c are the only in-service implementations.
I tested and locking/scheduler aren't fully working at this stage so pic_list_lock cannot be held.
Pulling the intr_irq_root_dev == NULL check. There are similar checks elsewhere and that won't directly effect this spot.
I'm unsure whether the call to child devices during gic_v3_init_secondary() should be removed. The only user of this is the GIVv3 ITS (550d01a2117f). Will most child devices need this called after the parent PIC is fully initialized? Do child devices merely need this called during processor bring-up?
The device I'm looking at, the device doesn't get added as a child. Simply manifests as a device connected to a GIC via a PPI interrupt and needs a per-processor call done at some point. Doesn't matter whether the call is done done before or after the GIC driver is fully initialized, merely needs to be called soon after each processor is started (xen_setup_vcpu_info()).
Given the number of uses this looks almost untested. The interface seems to be a wild guess about what devices will need. I'm interested in adjusting since this is precisely the hook I need, just this device won't end up in ->gic_children.
Whilst this looks like the right thing to do for the core framework, I believe our GICv2 driver won't like this in some edge cases, as for the FDT case it supports being a non-root PIC yet has an arm_gic_init_secondary which currently only runs for the root PIC, probably by design?
sys/arm/mv/mpic.c overrides it and isn't a root PIC, but the implementation is empty so that's harmless.
This will iterate in the reverse order in which the PICs were registered (since they're inserted at the head), which is probably the exact opposite of what you want, and not the current behaviour for GICv3, but that may or may not matter in that specific case.
This also will call pic_init_secondary for normal PICs and MSI PICs, so will call it twice for things that are both, e.g. GICv3 again (probably won't matter for things that are just MSI PICs, as those are unlikely to have an implementation, and if they do it's reasonable to expect it to be called).
You want to have if (dev != intr_irq_root_dev) return; added to arm_gic_init_secondary() as part of this? I don't have any way to test this.
sys/arm/mv/mpic.c overrides it and isn't a root PIC, but the implementation is empty so that's harmless.
In fact D40475 is to remove this.
This will iterate in the reverse order in which the PICs were registered (since they're inserted at the head), which is probably the exact opposite of what you want, and not the current behaviour for GICv3, but that may or may not matter in that specific case.
I'm aware. This could be addressed by changing pic_list to a STAILQ, but the single instance where this was needed it doesn't matter. The (non-root!) PIC in question needs per-processor initialization soon after each processor starts, and doesn't care whether the root PIC has been initialized or not. Given this it seemed best to leave this alone until something came along which did care (or a reviewer decides this is important).
This also will call pic_init_secondary for normal PICs and MSI PICs, so will call it twice for things that are both, e.g. GICv3 again (probably won't matter for things that are just MSI PICs, as those are unlikely to have an implementation, and if they do it's reasonable to expect it to be called).
You want if (dev != intr_irq_root_dev) return; added to every preexisting pic_init_secondary() implementation? Luckily this is simple as there are very few implementation (bcm2836 is the only one which wasn't mentioned).
You want if (dev != intr_irq_root_dev) return; added to every preexisting pic_init_secondary() implementation? Luckily this is simple as there are very few implementation (bcm2836 is the only one which wasn't mentioned).
That's neither necessary nor sufficient, so no.
I think it's the wrong way. The initiation of child PICs is the job of the parent PIC and must be done in right order and in coordination with the parent PIC. Please see D43452 for more details.
This is a problem with hierarchy of interrupts controllers in RISCV. So the fix is not only for a single instance. Logically, the root interrupt controller's init should be called first and then the secondary one's. The secondary controller may have a dependency on the root controller. Using STAILQ is better approach that would solve the problem for the two architectures.
This also will call pic_init_secondary for normal PICs and MSI PICs, so will call it twice for things that are both, e.g. GICv3 again (probably won't matter for things that are just MSI PICs, as those are unlikely to have an implementation, and if they do it's reasonable to expect it to be called).
You want if (dev != intr_irq_root_dev) return; added to every preexisting pic_init_secondary() implementation? Luckily this is simple as there are very few implementation (bcm2836 is the only one which wasn't mentioned).
Okay, what approach do you suggest?
Presently it is difficult to identify the correct action since there are only 4 implementations of the function. intr_pic_init_secondary() only calls pic_init_secondary() on the root PIC. There are 4 implementations of pic_init_secondary(): sys/arm/broadcom/bcm2835/bcm2836.c, sys/arm/arm/gic.c, sys/arm64/arm64/gic_v3.c and sys/arm64/arm64/gicv3_its.c. Of these, only git_v3_init_secondary() (sys/arm64/arm64/gic_v3.c) attempts to propagates the call.
I was guessing actual multi-PIC systems were fairly rare.
That may be true for some/most PICs, but I'm looking at one for which this isn't true. The particular PIC is allocated a single PPI interrupt on the root PIC and isn't presented as a child of the root PIC. Its ordering requirement is it needs (fairly expensive) per-processor initialization done after each processor starts and doesn't care whether any other PIC's initialization has been done.
Are there any devices for which changing pic_list from a SLIST to a STAILQ would fail to generate correct results? If not then the only issue is what adjustment the existing pic_init_secondary() implementations need.
@jrtc27 I was hoping for an answer to my question before doing anything.
I lack appropriate hardware to confirm this, but I suspect gic_v3_init_secondary() was already being called a second time for systems with both a normal PIC and a MSI PIC. As long as both were GICv3 implementations and one was a direct child of the other, the loop at the end of gic_v3_init_secondary() was doing this.
The only remaining issue is the other pic_init_secondary() implementations which seem unclear whether multiple PIC systems were ever considered and the correct behavior for these.
As expected, SLIST->STAILQ made no difference to my test system, but hopefully works for other devices.
Why would it call it a second time? There's only one newbus device, but multiple PICs for that device, one per interrupt type. That's the point; device-to-PIC is a one-to-many relationship, it's only one-to-one when you factor in the type too on the left.
The only remaining issue is the other pic_init_secondary() implementations which seem unclear whether multiple PIC systems were ever considered and the correct behavior for these.
No it's not. It's still not guaranteed to match the actual topology AFAIK.
@jrtc27 not being on intimate terms with the GICv3 implementation and lacking hardware to confirm how things work, I've got no idea what to do. What I do know is the extra PIC for this device never gets into sc->gic_children and therefore the loop in gic_v3_init_secondary() doesn't work for this device.
*sigh* please read what I said. PIC != newbus device. A PIC contains a reference to the newbus device that created it, but there can be multiple PICs for the same newbus device. Since you iterate over the list of PICs you will get those duplicates. The code in question, which isn't hard to find:
$ grep -nC3 intr_'\(pic\|msi\)'_register sys/arm64/arm64/gic_v3*.c sys/arm64/arm64/gic_v3_acpi.c-327- if (err != 0) sys/arm64/arm64/gic_v3_acpi.c-328- goto error; sys/arm64/arm64/gic_v3_acpi.c-329- sys/arm64/arm64/gic_v3_acpi.c:330: sc->gic_pic = intr_pic_register(dev, ACPI_INTR_XREF); sys/arm64/arm64/gic_v3_acpi.c-331- if (sc->gic_pic == NULL) { sys/arm64/arm64/gic_v3_acpi.c-332- device_printf(dev, "could not register PIC\n"); sys/arm64/arm64/gic_v3_acpi.c-333- err = ENXIO; -- sys/arm64/arm64/gic_v3_acpi.c-338- * required for Hyper-V GIC to work in ARM64. sys/arm64/arm64/gic_v3_acpi.c-339- */ sys/arm64/arm64/gic_v3_acpi.c-340- if (vm_guest == VM_GUEST_HV) { sys/arm64/arm64/gic_v3_acpi.c:341: err = intr_msi_register(dev, ACPI_MSI_XREF); sys/arm64/arm64/gic_v3_acpi.c-342- if (err) { sys/arm64/arm64/gic_v3_acpi.c-343- device_printf(dev, "could not register MSI\n"); sys/arm64/arm64/gic_v3_acpi.c-344- goto error; -- sys/arm64/arm64/gic_v3_fdt.c-148- goto error; sys/arm64/arm64/gic_v3_fdt.c-149- sys/arm64/arm64/gic_v3_fdt.c-150- xref = OF_xref_from_node(ofw_bus_get_node(dev)); sys/arm64/arm64/gic_v3_fdt.c:151: sc->gic_pic = intr_pic_register(dev, xref); sys/arm64/arm64/gic_v3_fdt.c-152- if (sc->gic_pic == NULL) { sys/arm64/arm64/gic_v3_fdt.c-153- device_printf(dev, "could not register PIC\n"); sys/arm64/arm64/gic_v3_fdt.c-154- err = ENXIO; -- sys/arm64/arm64/gic_v3_fdt.c-156- } sys/arm64/arm64/gic_v3_fdt.c-157- sys/arm64/arm64/gic_v3_fdt.c-158- if (sc->gic_mbi_start > 0) sys/arm64/arm64/gic_v3_fdt.c:159: intr_msi_register(dev, xref); sys/arm64/arm64/gic_v3_fdt.c-160- sys/arm64/arm64/gic_v3_fdt.c-161- /* Register xref */ sys/arm64/arm64/gic_v3_fdt.c-162- OF_device_register_xref(xref, dev);
Two PICs, one device. Now, gic_children deals with devices, not PICs, and so it does not have this duplication. Which is good, because we don't want to call gic_v3_init_secondary twice on the same device. Note the important fact that, once again, we're dealing with newbus devices, not PICs, so the fact there are two PICs doesn't mean we should call it twice. So gic_v3_init_secondary does work as intended today. The problem (in this instance) is this code is duplicating calls.
I'm really struggling to understand what the source of confusion here is.
I hadn't noticed intr_msi_register()->pic_create() and intr_pic_register()->pic_create(), so those create two genuinely distinct entries on pic_list. The comment in gic_v3_acpi_attach() leaves me suspecting ACPI doesn't have the concept of a distinct PIC for MSI interrupts, which in turn makes me suspect...
Since every use calls intr_pic_register() and intr_msi_register() is optional, my first thought is to simply test the type and only call in case of == FLAG_PIC. Would you find that acceptable?
I've written this several times in other reviews, I think. Initialization of PICs on secondary cores must be coordinated with the parent PIC and must follow the PIC hierarchy. Imho, the only way to do this correctly is implement this in pic_init_secondary() for each PIC driver.
BTW - most real systems have multiple PIC instances - GICv3 has multiple functions/departments, each of which is a separate PIC, each PCI driver should contain a PIC. And it's not uncommon for a hierarchy to have 3 or 4 nodes.
Kind of overdue for action. I had already been moving discussion elsewhere.
So tens? Perhaps hundreds? Of other developers have been suggesting similar things, yet you have ignored this chorus telling you the existing approach is inadequate.
True, but the existing approach has a long list of problematic requirements:
- All PICs must implement pic_init_secondary() handlers. Searching only found them in: sys/arm/arm/gic.c, sys/arm64/arm64/gic_v3.c, sys/arm64/arm64/gicv3_its.c, and sys/riscv/riscv/intc.c.
- All pic_init_secondary() handlers must call PIC_INIT_SECONDARY() on all children. Of the mere 4 existing implementations, only sys/arm64/arm64/gic_v3.c does this.
- All PICs in the device tree must be presented in the PIC hierarchy.
I've seen comments in source being worried about the root PIC not being enumerated first, yet has this ever occurred in the Real World? Replacing the PIC list with a STAILQ and relying on PICs being enumerated from the root is a heuristic, but it is likely to match most (if not all) Real World use cases. @himanshu_thechauhan.dev's comment suggested this approach should handle their situation. So perhaps this fails in theory, but provides a superior engineering trade-off?
Sure. Yet the rarity of pic_init_secondary() being implemented suggests the successful uses cases are more notable than the failures. I haven't done much with device-trees, but concern #3 seems rather severe for peripheral bus PICs which are remote from the root. Mainly their only presentation in the PIC hierarchy is being connected via PPI.
The device-tree I'm looking at has been in deployment for 5-10+ years. Linux has a driver, so it seems accepted.
One other sort-of workable approach would be add a requirement for pic_init_secondary() implementations to be safe for being call multiple times. Then have intr_pic_init_secondary() loop through pic_list after calling PIC_INIT_SECONDARY(intr_irq_root_dev);.
I'm unsure whether the conversion to STAILQ would then be necessary. The PIC I'm looking at the function is already safe (the function might be expensive to call multiple times, but it wouldn't cause problems).
Once the fixes to the multi-root support are in, using those will address my need. Unfortunately the problem brought up by the comment. "QQQ: Only root PIC is aware of other CPUs ???", still remains. This is no longer productive for me, so abandoning this one.
Reclaiming/reopening as per note in D47279. This could be a better solution to the issue pointed at.