Execute PowerPC64/AIM kernel from direct map region when possible
ClosedPublic

Authored by nwhitehorn on Mar 11 2018, 6:06 AM.

Details

Summary

When the kernel can run in real mode during early boot, it can execute from high addresses that alias the kernel's physical memory. If such a high address has its top two bits set to 1 (0xc...), it automatically falls within the direct map. This does two useful things for us: it reduces page table pressure from the kernel, and it prepares the kernel for radix translation, which requires the kernel to live at these high addresses. From an architectural perspective, the mechanism that allows this will also be useful for Book-E (which there is a chance I have broken horribly with this patch), as it in principle allows all of the first-stage boot code involving the MMU in booke/locore.S to be removed and replaced with a much shorter C version that handles more complicated situations (larger kernels, for example).
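As a rough illustration of the address arithmetic described above, the following sketch forms a high alias by setting the top two bits of a physical address. The macro and function names are hypothetical and are not taken from the patch; the only fact assumed from the summary is that the aliased region starts at 0xc... (top two bits set).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name: base of the region whose top two address bits are 1. */
#define HIGH_ALIAS_BASE UINT64_C(0xc000000000000000)

/* Return true if both of the top two bits of a 64-bit address are set. */
static bool
in_high_alias_region(uint64_t va)
{
    return ((va & HIGH_ALIAS_BASE) == HIGH_ALIAS_BASE);
}

/* Form the high alias of a physical address by setting the top two bits. */
static uint64_t
phys_to_high_alias(uint64_t pa)
{
    /* This sketch only handles addresses with both top bits clear. */
    assert((pa & HIGH_ALIAS_BASE) == 0);
    return (pa | HIGH_ALIAS_BASE);
}

/* Recover the physical address from a high alias by clearing those bits. */
static uint64_t
high_alias_to_phys(uint64_t va)
{
    assert(in_high_alias_region(va));
    return (va & ~HIGH_ALIAS_BASE);
}
```

For example, under these assumptions a kernel loaded at physical address 0x100000 would have the high alias 0xc000000000100000, and clearing the top two bits of that alias gives back 0x100000.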

This is accomplished by exploiting the fact that all PowerPC kernels are built as position-independent executables and relocate themselves on start. Before this patch, the kernel ran at 1:1 VA:PA, with that address chosen arbitrarily by the bootloader. Very early, it processes its ELF relocations so that it can operate wherever it happens to find itself. This patch calls a new early function, aim_early_init(), from near the beginning of powerpc_init(). (This function has access to all kernel services set up by that time, including all globals at their normal addresses.) It checks whether the kernel is in the right place and should be moved to high memory. If so, it calls __start() a second time at a high aliased address. The ELF relocations are then processed again at that address, transparently relinking the kernel in high memory. The second time around, aim_early_init() returns normally and powerpc_init() continues as before.
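The two-pass control flow can be modelled in a small userland toy. This is only a sketch of the idea, not the code in aim_machdep.c or locore64.S: every name below is hypothetical, the "image" has a single relocated pointer, and the real_mode flag stands in for whatever check the kernel actually performs. One deliberate difference is noted in the comments: in this toy the first pass regains control after the second returns, so it has to bail out explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from the summary: high aliases have the top two bits set (0xc...). */
#define HIGH_ALIAS_BASE UINT64_C(0xc000000000000000)

/* Toy stand-in for a self-relocating image; all fields are hypothetical. */
struct toy_image {
    uint64_t run_base;     /* address the image currently runs at */
    uint64_t link_offset;  /* link-time offset of one pointer-valued global */
    uint64_t global_ptr;   /* value of that global after relocation */
    int      reloc_passes; /* number of times relocations were processed */
    int      init_done;    /* number of times the rest of init completed */
    bool     real_mode;    /* whether aliased addresses are usable early */
};

static void toy_start(struct toy_image *img, uint64_t base);

/*
 * Stand-in for the early hook. Returns true if it restarted the image at
 * the high alias, false if initialization should simply continue.
 */
static bool
toy_early_init(struct toy_image *img)
{
    bool high = (img->run_base & HIGH_ALIAS_BASE) == HIGH_ALIAS_BASE;

    if (high || !img->real_mode)
        return (false); /* second pass, or aliasing not available */
    toy_start(img, img->run_base | HIGH_ALIAS_BASE);
    return (true);
}

/* Stand-in for the main init routine, which calls the early hook first. */
static void
toy_init(struct toy_image *img)
{
    /* Toy-only: the first pass regains control here, so it must stop. */
    if (toy_early_init(img))
        return;
    img->init_done++; /* the rest of initialization would go here */
}

/* Stand-in for the entry point: relocate for the given base, then init. */
static void
toy_start(struct toy_image *img, uint64_t base)
{
    assert(img != NULL);
    img->run_base = base;
    img->global_ptr = base + img->link_offset; /* "process relocations" */
    img->reloc_passes++;
    toy_init(img);
}
```

Starting the toy at a low base with real_mode set processes the relocation twice and leaves the global pointing into the high alias. With real_mode clear, which loosely mirrors the Apple G5 case mentioned in the test notes, the hook does nothing and there is a single pass.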

Tested on pSeries (PowerKVM), Apple G5 (change is a no-op because the kernel is in virtual mode early and cannot use aliased addresses), PS3, and PowerNV (QEMU). The interesting parts of the patch are in aim_machdep.c, machdep.c, and locore64.S. The remainder of the patch consists of fixes to latent bugs in other code brought to the fore by this change.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 15536

Event Timeline

nwhitehorn created this revision. Mar 11 2018, 6:06 AM
nwhitehorn edited the summary of this revision. Mar 11 2018, 6:07 AM

Looks fine to me aside from the one nit. There's no risk to Book-E on this, since mdp is always NULL currently, at least in my testing.

ofw/ofwcall64.S
154 (On Diff #40148)

Label math is nicer than counting instructions. I have already converted the counted instructions in booke/locore.S to label math, just in case.

nwhitehorn updated this revision to Diff 40235. Mar 13 2018, 4:26 AM

Use label arithmetic rather than instruction counting, which is much more robust.

jhibbits accepted this revision. Mar 13 2018, 1:33 PM

Looks good

This revision is now accepted and ready to land. Mar 13 2018, 1:33 PM