I have pretty mixed feelings about this approach. It only works for ofwbus children, not simplebus children, and canonicalizes a bunch of behaviors that I don't believe are standards (I'm in an airport departure lounge and don't have the spec handy). I'm also not really sure how it interacts with multipass etc. Could you elaborate a little more on the mechanism?
Sep 7 2019
Aug 27 2019
Aug 26 2019
Fix fat-fingering; this is the right diff. It also sets Python to use Python 2, since a handful of scripts cared.
Updates from review.
Aug 25 2019
Jul 21 2019
OK, sounds good, thanks for testing!
Jul 15 2019
Just make the page table size really small. There's a spill counter in the statistics; when it gets small enough, you should see it moving. If spill-handling doesn't work, and you have spills, everything breaks fast.
Does this break PTE spills? This code path historically was involved in re-adding PVOs to the hashed page table on faults if they got spilled.
Jul 11 2019
That's a fair point. I've used them to assess hash-table spill rates when performance is not great, but not for a long time. We could also just delete them.
Yeah, counter(9) seems like the actual solution here.
Jun 19 2019
Looks good to me; do you know how much this actually helps?
Jun 17 2019
Jun 16 2019
Jun 11 2019
Feb 26 2019
Should we panic if freeze_timebase somehow doesn't get set? Or, alternately, fall back on the old lame thing?
Feb 10 2019
Jan 13 2019
Longer-term, it might be better to avoid a proliferation of special cases like this to set LOADER_MSDOS_SUPPORT for the U-boot loader and make loader flexible enough to find the kernel etc. in the root rather than /boot. But this is fine for now.
Oct 19 2018
Oct 11 2018
Aside from one inline note, this looks OK for non-SPE.
Oct 5 2018
Sep 19 2018
Sep 17 2018
Can we make the check test for the presence of a PHB4 node in the device tree rather than using an #ifdef? It might also be best to do this in platform_powernv.c, but that's a bit larger of a project.
Aug 30 2018
In D16794#361563, @jhb wrote: On closer examination, the loader doesn't actually modify the kernel or modules in place at all. Instead, this routine is used to apply relocations on copies of data. So for example, the code to parse module metadata copies individual records out of the kernel into a local variable on the stack, then asks the relocation to be applied to that local copy. This is a really good reason to honor the 'data' and 'len' bounds as otherwise it allows writing to random crap on the stack. Also, we do explicitly ignore all relocations for the kernel itself, so it must be some relocation in tcp_rack.ko that wasn't handled before that is being handled now perhaps. I would have expected that to trigger a warning if so in the existing code though for an "unhandled relocation type".
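To make the 'data'/'len' point concrete, here is a minimal sketch of a bounds-checked store into a local copy. The function name, its parameters, and the fixed 64-bit width are all assumptions for illustration; this is not the loader's actual relocation routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not the loader's real API): store a relocated
 * 64-bit value into a local copy of image data.
 *   base  - address in the image that the copy was taken from
 *   data  - the local copy
 *   len   - size of the local copy in bytes
 *   where - relocation target address in the image
 * Returns 0 if the store was applied, -1 if the target does not lie
 * entirely inside [base, base + len).
 */
static int
reloc_store64(uintptr_t base, void *data, size_t len, uintptr_t where,
    uint64_t value)
{
	size_t off;

	assert(data != NULL);
	if (where < base)
		return (-1);
	off = (size_t)(where - base);
	/* Reject targets that start past the copy or would run off its end. */
	if (off > len || len - off < sizeof(value))
		return (-1);
	memcpy((char *)data + off, &value, sizeof(value));
	return (0);
}
```

A relocation whose target falls outside the copied record is skipped rather than written to whatever sits next to the record on the stack.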
Aug 29 2018
In D16794#361523, @jhb wrote: I believe since ppc is using Elf_Rela and that all of the relocations calculated absolute values to store at *where or *hwhere (vs using += or the like), the loader relocations just get overwritten by the kernel. I think when using Elf_Rel you might use the original value of *where as the addend in which case multiple passes of relocations is indeed destructive, but I don't think it is for Elf_Rela.
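The difference between the two relocation styles can be sketched as follows. The struct and function names are simplified stand-ins invented for this example, not the real ELF definitions or the loader's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a relocation record that carries its addend. */
struct rela_sketch {
	int64_t addend;
};

/*
 * Elf_Rela style: the addend lives in the record and the result
 * overwrites *where, so a later pass with a new base simply replaces
 * the earlier result.
 */
static void
apply_rela(uint64_t *where, uint64_t base, const struct rela_sketch *r)
{
	assert(where != NULL && r != NULL);
	*where = base + (uint64_t)r->addend;
}

/*
 * Elf_Rel style: the addend is read from *where itself, so the first
 * pass consumes it and a second pass stacks on top of the first.
 */
static void
apply_rel(uint64_t *where, uint64_t base)
{
	assert(where != NULL);
	*where += base;
}
```

Running `apply_rela()` twice with different bases leaves the value for the last base; running `apply_rel()` twice leaves the sum of both bases plus the original addend.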
Aug 24 2018
Aug 19 2018
I just looked at the actual patch. The kernel sometimes (almost always on ppc64, less often on ppc32) relocates itself again post-loader to an address that loader does not know, so I think this does not solve the actual problem, or at least solves it in only a subset of cases. Processing these relocations in the kernel would solve the issue in all cases.
In D16794#357383, @jhb wrote: I added Justin and Nathan (who I meant to include from the start).
I believe PPC kernels are relocatable. I think they use ET_DYN instead of ET_EXEC (and there is a pending PR to fix that issue in libkvm as libkvm doesn't work on it currently as it only expects ET_EXEC for a kernel). Justin and/or Nathan can confirm.
Aug 11 2018
Looks OK to me. One question: Do we want the various instances of fdt_check_header() in our code to be spelled FDT_RO_PROBE() now?
Jul 31 2018
Jul 19 2018
Since I'm not familiar with the clock code, where is EXT_RESOURCES defined? Can we just use that code unconditionally?
Jun 29 2018
Assuming you have tested this and it works, looks great.
Jun 21 2018
Looks good to me.
Jun 20 2018
Approved if you fix the #if 0 bit.
Jun 15 2018
Don't you want strcmp() == 0?
It's mildly stupid, since the universe of pci* isn't big, but I think I would prefer a check for "pci" || "pciex" to just checking the first three characters.
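Roughly what I have in mind, as a sketch. The helper names are made up for the example, and I'm assuming the string in question is a NUL-terminated name already read from the node.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only for the exact names "pci" or "pciex". */
static int
is_pci_type(const char *type)
{
	assert(type != NULL);
	return (strcmp(type, "pci") == 0 || strcmp(type, "pciex") == 0);
}

/* The prefix test it would replace: matches anything starting with "pci". */
static int
is_pci_prefix(const char *type)
{
	assert(type != NULL);
	return (strncmp(type, "pci", 3) == 0);
}
```

The exact match rejects a name like "pcixyz", which the three-character prefix test would accept.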
Jun 11 2018
In D15229#333064, @imp wrote: In D15229#333057, @nwhitehorn wrote: In D15229#333038, @imp wrote: The final option would be to do a variation on the prior option, but institutionalize it in DEVICE_ATTACH_LATE. It would be called just after interrupts were enabled for deferred work like this. It would cover the vast majority of driver interdependencies by giving a 'last' point. At that point, you return an error if you still weren't ready, and we'd loop while the number of devices returning an error was declining. It could largely replace config_intrhook.
I think this would work reasonably well for the general case here; the other suggestions would solve the case of smu(4), but there are other, simpler, ways to do that.
That said, I'm not sure what the advantage of a DEVICE_ATTACH_LATE() is over the mechanism originally proposed here or something like it except that it, as a side effect, delays until after interrupts -- which might not even be wanted, for example if the device in question is an interrupt controller.
If the device is an interrupt controller, you need to put it on a different pass. There's no way around that. But this isn't an interrupt controller, and interrupt controllers are special beasts that already have special handling in many places. Having a delayed attach for them is likely completely unacceptable so none of the proposals is appropriate for them.
In D15229#333038, @imp wrote: The final option would be to do a variation on the prior option, but institutionalize it in DEVICE_ATTACH_LATE. It would be called just after interrupts were enabled for deferred work like this. It would cover the vast majority of driver interdependencies by giving a 'last' point. At that point, you return an error if you still weren't ready, and we'd loop while the number of devices returning an error was declining. It could largely replace config_intrhook.
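The "loop while the number of devices returning an error was declining" idea can be sketched in plain C as below. Nothing here is newbus code: `struct pending_dev`, `run_late_attach()`, and the example callbacks are hypothetical names made up to show the termination rule only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical late-attach callback: 0 on success, nonzero if not ready. */
typedef int (*attach_late_fn)(void *softc);

struct pending_dev {
	attach_late_fn attach_late;
	void *softc;
	int done;
};

/*
 * Sweep the pending devices repeatedly. Stop when everything has
 * attached or when a sweep fails to reduce the number of errors.
 * Returns the number of devices still unattached.
 */
static size_t
run_late_attach(struct pending_dev *devs, size_t n)
{
	size_t failed, prev_failed, i;

	assert(devs != NULL || n == 0);
	prev_failed = n + 1;	/* guarantees at least one full sweep */
	for (;;) {
		failed = 0;
		for (i = 0; i < n; i++) {
			if (devs[i].done)
				continue;
			if (devs[i].attach_late(devs[i].softc) == 0)
				devs[i].done = 1;
			else
				failed++;
		}
		if (failed == 0 || failed >= prev_failed)
			return (failed);
		prev_failed = failed;
	}
}

/* Example callbacks: B depends on A, and C never becomes ready. */
static int a_ready;

static int
attach_a(void *sc)
{
	(void)sc;
	a_ready = 1;
	return (0);
}

static int
attach_b(void *sc)
{
	(void)sc;
	return (a_ready ? 0 : 1);
}

static int
attach_c(void *sc)
{
	(void)sc;
	return (1);
}
```

With the devices listed in the order B, C, A, the first sweep attaches only A, the second attaches B, and the third makes no progress, so the loop stops with C still pending. Because the error count must strictly decrease to continue, the loop always terminates.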
Jun 5 2018
It's been another couple weeks. Any objections or approvals?
Jun 4 2018
May 30 2018
May 26 2018
This looks good to me, though I haven't tested on P8.
On POWER9, I think the lock is unnecessary (the ISA spec doesn't mention it).
May 25 2018
Any other thoughts? I'd like to get this in soon since it is blocking other things.
May 20 2018
Fix style and spelling issues in man pages.
Probe for pending devices at the end of each pass and after modules loaded. Add documentation.
See the comment on line 467 about powernv_smp_ap_init(). Otherwise, looks good.
May 19 2018
Approved if you also install it to EXC_HEA.
May 18 2018
In D15482#326710, @breno.leitao_gmail.com wrote:
- Is there a document describing this that we can reference?
I was not able to find any documentation. Looking at the web, I found a patch[1] that documents it quickly in Linux, but it seems it was never accepted. I will ask internally.
This looks good, even if the device tree situation is a bit lamentable. Two questions:
- Is there a document describing this that we can reference?
- If this is just part of the PowerNV platform, maybe the code belongs in platform_powernv.c instead?
May 14 2018
Thanks -- I'll tweak the implementation to handle KLDs properly and write a man page, then update this diff.
May 13 2018
Any further comments? I'd really like to get this, or something like it, in.
May 8 2018
We should do the same thing (check fdtbootcpu rather than PIR) on PowerNV too. Thanks for the work on this patch!
May 7 2018
Looks good to me. I haven't tested it, of course -- is this working well for you?
This is an interesting problem that I need to think about -- thanks for identifying it! I think this particular patch is not the right one, so let's put a pin in this for a couple days.
The CPU is stored in the device tree in one of two ways:
- As an ihandle "cpu" in /chosen
- The "fdtbootcpu" integer property from /chosen
May 6 2018
May 4 2018
In D15229#321549, @jhb wrote: (For future reference, a diff with context is nicer.)
(BTW, while reviewing this I found the device_register() "gem" which is a bit of a hack that snuck in with iflib and should be replaced with an iflib_if.m and a dedicated IFLIB_REGISTER rather than looking at magic numbers in the first 4 bytes of a pointer, etc. That's just ridiculous.)
I assume you're waiting on docs until the design is ok'd (which is sensible), but once that is settled DEVICE_ATTACH.9 probably needs extending to document this.
May 1 2018
Apr 30 2018
Simplify logic dealing with pending attachments.
Apr 25 2018
We should fix that, too, though. This code assumes that we are booting on CPU 0 -- the same assumption could continue as a stopgap until we figure out a reliable way to evaluate the boot processor.
Apr 24 2018
Is there a reason not to just adopt the PowerNV version of this code, which already does the right thing?
Apr 23 2018
Apr 19 2018
Apr 18 2018
Apr 17 2018
This looks great, with one nit about the way you determine whether VSX is available.
Apr 5 2018
Mar 16 2018
I think there is also a subtlety for uncaught signals if the previous trap was a kernel fault, since you then end up in the wrong branch of the kernel/user if in trap() and probably crash.
Sure, it will deliver the signal. But the value of MSR when control is returned to userland may be wrong (it may even have MSR[PR] unset and return in supervisor mode if the previous trap on this CPU was in the kernel!) and it will resume execution in the wrong place if the signal is caught and handled.
Mar 15 2018
I'm looking at this again. Does this actually work? HEA interrupts set HSRR0/HSRR1, which we don't save, so I think this will jump back to some random other memory location (whatever happened to be in SRR0).
Mar 14 2018
Mar 13 2018
Use label arithmetic rather than instruction counting, which is much more robust.