sys/arm64/arm64/locore.S
 */
drop_to_el1:
	mrs	x23, CurrentEL
	lsr	x23, x23, #2
	cmp	x23, #0x2
	b.eq	1f
	ret
1:
	/*
	 * If the MMU is active, then it is using a page table where VA == PA.
	 * But the page table won't have entries for the hypervisor EL2
	 * initialization code which is loaded into memory with the vmm module.
	 *
	 * So we disable the MMU in EL2 to make the vmm hypervisor code run
	 * successfully.
	 */
	dsb	sy
alexandru.elisei_arm.com: I'm not sure about the purpose of this instruction. I remember copying it from where we disable the MMU from EL1 (after drop_to_el1 returns), but now I can't figure out why this is used. If the MMU is off, it doesn't do anything because data accesses are to Device-nGnRnE memory according to ARM DDI 0487F.b, page D5-2586 (the PE reads and writes directly to main memory). If the MMU is on, I believe this still has no effect because:
	mrs	x2, sctlr_el2
	bic	x2, x2, SCTLR_M
	msr	sctlr_el2, x2
alexandru.elisei_arm.com: I haven't been reading the code, but I haven't found where the dcache is invalidated before the MMU is turned on at EL1. Where is that done?
andrew: In the normal boot process we expect the bootloader to have invalidated the dcache.
alexandru.elisei_arm.com: The patch itself looks alright to me, but I haven't tested it. As for why I think FreeBSD should do dcache invalidation before turning the MMU back on, wall of text below.

I've been reading the Arm ARM, DDI 0487F.b, trying to find out exactly what a CPU is allowed to speculate, and I've only found what it is *not* allowed to speculate (section B2.3.6 "Restrictions on the effects of speculation"). From that I assume the CPU is able to speculate *any* loads regardless of the program order, and modify the dcache as a result. This gets somewhat confirmed by the definition of the SB (Speculation Barrier) instruction (page C6-1188): "In particular, any instruction that appears later in the program order than the barrier cannot cause a speculative allocation into any caching structure where the allocation of that entry could be

My interpretation of that statement is that without the barrier, the PE is allowed to modify the dcache as a result of speculation. This is the definition of Device memory from page B2-117: "The Arm architecture forbids Speculative reads of any type of Device memory. This means Device memory types are suitable attributes for read-sensitive Locations." When address translation is disabled, data accesses are to the Device-nGnRnE memory type (page D5-2586).

The point I am trying to make is that as long as the MMU is enabled (SCTLR_ELx.M == 1), the PE is allowed to speculate any loads and modify the dcache. This is something that FreeBSD allows the bootloader to do. The other side of the story is that even if FreeBSD starts execution with the MMU disabled, if it's running under a hypervisor, the dcaches might have been populated while the host was running on the same PE (either intentionally or as a result of speculation).

As a result, dcache invalidation must be performed right before the MMU is turned on (after the last load in program order, to be more precise), so any speculative reads on the host side will populate the dcache with the latest values that the guest wrote to memory. I hope what I wrote above makes sense.
andrew: Which memory do you think needs d-cache management? As far as I can tell we should invalidate the d-cache for the page tables as there may be data for them in the cache; however, we create them with the cache disabled.
	isb

	/* Enable the HVC instruction and make EL1 AArch64 */
	ldr	x2, hcr
	msr	hcr_el2, x2

	/* Load the Virtualization Process ID Register */
	mrs	x2, midr_el1
	msr	vpidr_el2, x2

	/* Load the Virtualization Multiprocess ID Register */
	mrs	x2, mpidr_el1
	/* ... 13 lines not shown ... */
1:
	/* Enable access to the physical timers at EL1 */
	mrs	x2, cnthctl_el2
	orr	x2, x2, #(CNTHCTL_EL1PCTEN | CNTHCTL_EL1PCEN)
	msr	cnthctl_el2, x2

	/* Set the counter offset to a known value */
	msr	cntvoff_el2, xzr

	/* Install hypervisor trap functions */
	adrp	x2, hyp_stub_vectors
	add	x2, x2, :lo12:hyp_stub_vectors
	msr	vbar_el2, x2

	/* Use the host VTTBR_EL2 to tell the host and the guests apart */
	mov	x2, #VTTBR_HOST
	msr	vttbr_el2, x2

	mov	x2, #(PSR_F | PSR_I | PSR_A | PSR_D | PSR_M_EL1h)
	msr	spsr_el2, x2
	/* Configure GICv3 CPU interface */
	mrs	x2, id_aa64pfr0_el1
	/* Extract GIC bits from the register */
	ubfx	x2, x2, #ID_AA64PFR0_GIC_SHIFT, #ID_AA64PFR0_GIC_BITS
	/* GIC[3:0] == 0001 - GIC CPU interface via special regs. supported */
	cmp	x2, #(ID_AA64PFR0_GIC_CPUIF_EN >> ID_AA64PFR0_GIC_SHIFT)
	b.ne	2f

	mrs	x2, icc_sre_el2
	orr	x2, x2, #ICC_SRE_EL2_EN	/* Enable access from insecure EL1 */
	orr	x2, x2, #ICC_SRE_EL2_SRE	/* Enable system registers */
	msr	icc_sre_el2, x2
2:
	/* Set the address to return to our return address */
	msr	elr_el2, x30
	isb
alexandru.elisei_arm.com: According to the definition of a context synchronization event from ARM DDI 0487F.b, page Glossary-8112, the isb and eret instructions are equivalent. I believe this isb is redundant.
	eret

	.align	3
.Lsctlr_res1:
	.quad	SCTLR_RES1
hcr:
	/* Make sure the HVC instruction is not disabled */
	.quad	(HCR_RW & ~HCR_HCD)
/*
 * Get the delta between the physical address we were loaded to and the
 * virtual address we expect to run from. This is used when building the
 * initial page table.
 */
get_virt_delta:
	/* Load the physical address of virt_map */