The architecture nexus should handle allocation and release of memory and
interrupts. These need to be visible to the rest of the kernel and not
hidden in the OpenFirmware code.
This is a very early tentative proposal. The issue is without this resource allocations which get to the nexus may be given ranges overlapping with what the OpenFirmware code has. Two drivers trying to use the same physical address range will be Bad(tm).
Looks like every variant of nexus.c except for PPC is already prepared to handle memory/interrupt allocations and therefore the correct solution is to simply forward all allocations to the nexus. There is likely a need to introduce similar code in sys/powerpc/powerpc/nexus.c.
Trick would then be every variant of sys/*/*/nexus.c would have mostly identical code for handling these allocations. Time to introduce sys/dev/bus/root.c? Difficulty here is there is no architecture-independent function for retrieving the highest physical address (for mem_rman.rm_end).
This is needed for the attempt at getting Xen operational on ARM. When running on Xen there is a need to allocate physical addresses and tell the hypervisor to map virtual devices there. If both nexus.c and ofwbus.c only have part of the story of which physical addresses are used, trouble will ensue.
To state the problem as I understand it: ofwbus fails to reserve or report its resource usage back to nexus, violating an assumption of newbus. This is of no consequence when all devices allocate resources through ofwbus, but becomes an issue when a child of nexus needs to allocate an unused resource region (this will be the case for Xen's PV bus on arm64).
As you point out, simply removing these routines is an insufficient solution on powerpc; instead, they should probably be moved back into powerpc's nexus.c. Other architectures are set up to handle allocations from nexus already, but we would need verify that things still behave properly on each platform if this change is applied.
Tagging @nwhitehorn, who might have input on this.
@mhorne, you got it exactly right. Issue is ofwbus is keeping everything to itself, which is fine if *everything* goes through ofwbus, but a major problem for anything independent of ofwbus (such as Xen PV).
I hadn't tracked down when it happened, so that is interesting to see it originated on PPC. Thing is what was being done was almost right. Quite a bit of sys/*/*/nexus.c is common code which should be merged together, but it needs to move to a nexus_common.c or root.c which is $somewhere (sys/dev? sys/kern?).
Updating with a potential move to sys/powerpc/powerpc/nexus.c. I've tried to move this a bit closer to the other nexuses. Most others use static variables in the file, rather than the softc. I've also opted to limit the regions for the resource managers since the resource manager code actually breaks when you use the 0 to ~0 range.
Okay, I was able to boot this on a few devices for arm, arm64, and riscv.
For arm, I needed to add an intr_rman and its handling to its nexus bus (see D32357).
Otherwise, devices attached fine and I was able to see that the expected changes in the output of devinfo -u; that is, the same resources allocated from ofwbus are now allocated from nexus. Check out P517 and P518.
I was unable to test mips, mainly since the only platform I have access to (MALTA64 in QEMU) does not use FDT. Given the state of the port and its planned removal from CURRENT, this shouldn't be a blocker.
Someone with PowerPC hardware will need to do the same testing, ideally on both 32 and 64-bit.
Please do not change the region boundaries, we want to maintain the same behaviour before and after moving the code around.
Moving this out of ofwbus means that arm, arm64, mips, and riscv are now without a bus_adjust_resource method. I think this is okay, it is not used very widely and arm64 with ACPI has always been without it. Still, it will be worth noting in the final commit message, as this could be a regression.
I've run into some an additional oddity with the Open Firmware core. Only allocated resources are visible in devinfo.
The device-tree provided by Xen (see: D29875) provides an interrupt and a memory range. The memory range is completely empty, but is provided as a suggested range for mapping the Xen grant table. This range is completely invisible unless Xen's device-tree code allocates it.
Each time I run across the default range I hear klaxons screaming "uninitialized! uninitialized! uninitialized!" (new form of Dalek?). I really dislike leaving this alone, even if that ends up being a separate diff.
Huh. I hadn't noticed any issues, but that is certainly worthy of mention.
Well, they are not uninitialized, but initialized to the default/maximum values. Now that the overflow in rman_reserve_resource_bound() has been fixed we shouldn't fear using this range, it is meant to be supported.