Page MenuHomeFreeBSD

Boot arm64 kernel using booti command from U-boot.
ClosedPublic

Authored by siddharthtuli_gmail.com on Wed, Nov 6, 4:46 AM.

Details

Summary

Boot arm64 kernel using booti command from U-boot. booti can relocate initrd image into higher ram addresses, therefore align the initrd load address to 1GiB and create VA = PA map for it. Create L2 pagetable entries to copy the initrd image into KVA.
(parts of the code in https://reviews.freebsd.org/D13861 was referred and used as appropriate)

Test Plan

Test 1: Boot with kernel and DTB

uboot# booti 0x90000000 - 0x88000000

^^^^^^^^^^.  ^^^^^^^^^^
  kernel                DTB

Flattened Device Tree blob at 88000000

Booting using the fdt blob at 0x88000000
reserving fdt memory region: addr=17e000000 size=2000000
reserving fdt memory region: addr=81000000 size=200000
Using Device Tree in place at 0000000088000000, end 00000000880092c7

Starting kernel ...
<snip>
---<<BOOT>>---
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994

The Regents of the University of California. All rights reserved.

FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #4 bea520b3e4d-c263916(master)-dirty: Sun Nov 3 16:09:21 IST 2019

tsiddharth@:/usr/home/tsiddharth/freebsd/obj/usr/home/tsiddharth/freebsd/arm64.aarch64/sys/GENERIC arm64

FreeBSD clang version 9.0.0 (tags/RELEASE_900/final 372316) (based on LLVM 9.0.0)

WARNING: WITNESS option enabled, expect reduced performance. VT: init without driver. module firmware already present! Starting CPU 1 (1) NOTICE: CPU 1 is turned on with 0x7bfb! INFO: ns2_pwr_domain_on, cpu:1 ISFO: t Seating up GIC ror stcondary corgs CPU 2 (2) NOTICE: CPU 2 is turned on with 0x7bfb! INFO: ns2_pwr_domain_on, cpu:2 INFO: Settingaup GIC for srcondary corns g CPU 3 (3) NOTICE: CPU 3 is turned on with 0x7bfb! INFO: ns2_pwr_domain_on, cpu:3 INFOF Setting rp GIC for secondSry cDres SMP: Multiprocessor System Detected: 4 CPUs arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled. random: entropy device external interface kbd0 at kbdmux0 ofwbus0: <Open Firmware Device Tree> simplebus0: <Flattened device tree simple bus> on ofwbus0

</snip>

Test 2: Boot with kernel, initrd and DTB

uboot# booti 0x90000000 0x98000000:68f200 0x88000000

^^^^^^^^^^.  ^^^^^^^^^^            ^^^^^^^^^
  kernel                initd/mfs image.          DTB

Flattened Device Tree blob at 88000000

Booting using the fdt blob at 0x88000000
Loading Ramdisk to fe47c000, end feb0b200 ... OK
reserving fdt memory region: addr=17e000000 size=2000000
reserving fdt memory region: addr=81000000 size=200000
Using Device Tree in place at 0000000088000000, end 00000000880092c7

Starting kernel ...

<snip>

---<<BOOT>>---
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2019 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994

The Regents of the University of California. All rights reserved.

FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.0-CURRENT #4 bea520b3e4d-c263916(master)-dirty: Sun Nov 3 16:09:21 IST 2019

tsiddharth@:/usr/home/tsiddharth/freebsd/obj/usr/home/tsiddharth/freebsd/arm64.aarch64/sys/GENERIC arm64

FreeBSD clang version 9.0.0 (tags/RELEASE_900/final 372316) (based on LLVM 9.0.0)

WARNING: WITNESS option enabled, expect reduced performance. VT: init without driver. module firmware already present! Starting CPU 1 (1) NOTICE: CPU 1 is turned on with 0x7bfb! INFO: ns2_pwr_domain_on, cpu:1 INFO: S Setting ur GICtfor ieconnary cores CPU 2 (2) NOTICE: CPU 2 is turned on with 0x7bfb! INFO: ns2_pwr_domain_on, cpu:2 INFO:S Setaing rp GIt for secondaiy cores ng CPU 3 (3) NOTICE: CPU 3 is turned on with 0x7bfb! INFO: ns2_pwr_domain_on, cpu:3 INFO: Sertingeup GeC foB secSndary coresS MP: Multiprocessor System Detected: 4 CPUs arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled. random: entropy device external interface kbd0 at kbdmux0 ofwbus0: <Open Firmware Device Tree> simplebus0: <Flattened device tree simple bus> on ofwbus0 clk_fixed0: <Fixed clock> on simplebus0 clk_fixed1: <Fixed factor clock> on simplebus0

</snip>

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

manu added a reviewer: arm64.Wed, Nov 6, 7:55 AM
mhorne added a subscriber: mhorne.Wed, Nov 6, 10:09 PM
manu added a comment.Fri, Nov 8, 4:04 PM

I'd like to have this under LINUX_BOOT_ABI (like for arm32).
I have no real issue of having LINUX_BOOT_ABI being enabled in GENERIC.

sys/arm64/include/pte.h
137 ↗(On Diff #63990)

This change should be in another review.

sys/conf/Makefile.arm64
43 ↗(On Diff #63990)

Why is this under !empty(DDB_ENABLED) ?

manu added inline comments.Fri, Nov 8, 4:08 PM
sys/arm64/arm64/machdep_boot.c
335 ↗(On Diff #63990)

It seems that this will be called even if we're loaded by loader or am I missing something ?

siddharthtuli_gmail.com marked 3 inline comments as done.Sat, Nov 9, 6:59 AM

Sure, I'll create a new diff with u-boot stuff conditional with options LINUX_BOOT_ABI. I'll address other comments as well in the diff.

sys/arm64/arm64/machdep_boot.c
335 ↗(On Diff #63990)

No, metadata_endptr is set only if there a faked metadata which will be set when booted from U-Boot.

sys/arm64/include/pte.h
137 ↗(On Diff #63990)

Ok, I will create a new review for this hunk.

sys/conf/Makefile.arm64
43 ↗(On Diff #63990)

Hmm :( It shouldn't be. I used "patch" command to get diff from sb into freebsd/head. This appears to be some goof up with patch command.

siddharthtuli_gmail.com marked 3 inline comments as done.

Incorporated review comments.
The need to use booti is :
The boxes that need to be upgraded to FreeBSD arm64 are running Linux and have their disks formatted with ext4 partitions. FreeBSD image to be loaded will be placed on an ext4 partition. FreeBSD loader does not seem to have ext4 support (although it has source code ext2 fs). Hence U-boot is used to read the FreeBSD based image from ext4 partition and boot it using booti command.

Use VM_MEMATTR macros

Update -U999999 diff

manu added inline comments.Sat, Nov 9, 9:03 PM
sys/conf/Makefile.arm64
45 ↗(On Diff #64110)

I guess that kernel.bin should be installed if LINUX_BOOT_ABI is set, but I don't know how to do stuff like that in Makefile based on kernel options.
And installing it all the time don't make sense I think.

sys/conf/Makefile.arm64
45 ↗(On Diff #64110)

Please pardon my ignorance, however, Makefile.inc1 sets INSTALLKERNEL to ${KERNELCONFDIR}/kernel. Do you mean to say that INSTALLKERNEL should be set to ${KERNELCONFDIR}/kernel.bin if LINUX_BOOT_ABI is set? I can see options like FDT_DTS_FILE being used in Makefile.inc1. And it looks like kernel.bin installation is not being done for arm as well.

manu accepted this revision as: manu.Thu, Nov 14, 9:51 AM

Looks like we do not install kernel.bin for armv7 so we might want to do the same thing for arm64.
LGTM.

This revision is now accepted and ready to land.Thu, Nov 14, 9:51 AM
This revision was automatically updated to reflect the committed changes.
mmel added a subscriber: mmel.Fri, Nov 15, 9:17 AM

I'm sorry for delayed review, but I am overloaded with my current work on Rockchip ports.
But in any case, if you want to reuse someone else's patch, it would be better to include the original author directly as the reviewer.

head/sys/arm64/arm64/machdep_boot.c
328

This is extremely unsafe. Standard 'bootargs' contains linux command line which breaks many FreeBSD boot flow. Please, return original command line parsing (including "FreeBSD:" guard. This should be fixed quickly.

head/sys/conf/Makefile.arm64
34

This is completely wrong way and it's main reason, why I don't committed original D13861. Second link leaves us in situation when rest of kernel files (kernel.full, kernel.debug) are wrong (symbols are offseted by stripped down elf header). this makes debug impossible.
The should do binary edit there: replace ELF header by booti image header and adjust it properly (jump address, size of image).

mmel added a comment.Fri, Nov 15, 12:05 PM

Moreover this doesn't boot on RockPro64, where original code from D13861 works without problems. jhibbits, please revert this, this code needs definitely more love. (Please, be sure that I'm first who wants booti support in kernel, but broken support is worse that no support)

mmel added inline comments.Fri, Nov 15, 12:13 PM
head/sys/arm64/arm64/machdep_boot.c
122

Where is original location of dtb (pointed by dtb_ptr) mapped ?

head/sys/arm64/arm64/machdep_boot.c
122

locore.S creates PA=VA map for 1GiB aligned kernel loadaddr. dtb is a small file and it can be within 1GiB load addr of the kernel.

head/sys/conf/Makefile.arm64
34

Okay. However I see that Makefile.arm also does the same, so the same problem will exist with arm as well. Am I right in understanding?

jhibbits added inline comments.Fri, Nov 15, 3:14 PM
head/sys/arm64/arm64/machdep.c
1134

This should be guarded by the boot mode, really. On powerpc we only use ofw_parse_bootargs() if there's no mdp (meaning it's not booted from loader).

Actually, I see parse_bootargs() internally guards, but the guard is rather clumsy. See below.

head/sys/arm64/arm64/machdep_boot.c
314

Can you rename metadata_endptr to something else? Maybe fake_metadata_endptr? Because this should only be used if we're using fake metadata, so a name change would make it clearer that this is *only* for use with fake metadata.

mmel added a comment.Fri, Nov 15, 3:18 PM

please, use original locore.S from D13861 I'ts tested on many boards and it works :).

head/sys/arm64/arm64/machdep_boot.c
122

and if kernel is exactly 1GiB long ?

But you must follow u-boot way. fdt should be loaded to ${fdt_addr_r}, kernel to ${kernel_addr_r), so you cannot expect nothing about their addresses .

head/sys/conf/Makefile.arm64
34

kernel.bin is used only on old armv5 board, so nobody have interest to fix this. armv7 uses bootm commend for boot, and boot image must be created by utility fro u-boot tools . At
this time, we don't have this integrated into make.

mmel added a comment.Sat, Nov 16, 3:52 AM

After more detailed analysis, I think that there are more problems.
Let me explain situation. Givens are:

  1. Actual locore.S mappings of kernel expects that physical memory is aligned to 1GB boundary. This implies that we don’t support boards with lower that 1 GB memory (I don’t see other real limitation caused by this fact).
  2. U-Boot booti command expect that kernel, fdt, and initrd blobs are loaded at ${kernel_addr_r}, ${fdt_addr_r} and ${initrd_addr_r} offsets. Real value of these variables is driven by u-boot and we cannot expect nothing about this value. (order, gaps, etc..) It’s possible that kernel want to load kernel at begining of memory and initrd blob at end. We must support this, it's generic part of ‘distro_boot’ u-booot protocol/framework. And support of distro_boot is one of major goals for having booti supported, imo.

This give me some conclusions:
Because of 1) optimal mapping granularity is 1GB, more fine grained granularity is nothing than waste of TLB entries and build_l2_block_pagetable() is useless. (at least I don’t see single advantage of using fine grained mappings, while disadvantage is clear - more TLB pressure ).
Because of 2) – all 3 booti blobs must be taken separately. I don’t see single reason for postponing mapping of FDT to C code. Original locoer.S works and build_l2_block_pagetable() is unusable. Using lastaddr for reservation of memory used by initrd is clearly nonsense. Initrd blob should be reserved by using (probably) arm_physmem_exclude_regions() and map it later , when pmap is fully bootstrapped.

So I think that this patch needs significant rewrite. Did I forget anything? Do you want more details for something? What’s next?

Hi Mmel,

Please find my comments inline.

After more detailed analysis, I think that there are more problems.
Let me explain situation. Givens are:
• Actual locore.S mappings of kernel expects that physical memory is aligned to 1GB boundary. This implies that we don’t support boards with lower that 1 GB memory (I don’t see other real limitation caused by this fact).

In locore.S L1 block page table entry is built on 1GiB aligned physical load addr of kernel. However this is irrespective of using loader or U-Boot to boot the FreeBSD kernel.

• U-Boot booti command expect that kernel, fdt, and initrd blobs are loaded at ${kernel_addr_r}, ${fdt_addr_r} and ${initrd_addr_r} offsets.

Yes.
As part of this patch linux_load_initrd() will do build_l1_block_pagetable() to map initrd start address aligned to 1GiB PA=VA.
(By default, initrd_addr_r will be relocated to highmem by booti in U-Boot code (do_bootm_linux->boot_prep_linux->boot_reloc_ramdisk->boot_ramdisk_high)).

locore.S maps 1GiB algined kernel_addr_r PA=VA. I assumed that fdt_addr_r ( a couple of KiB’s in size) can also be within this 1GiB aligned addr. However, fdt_addr_r can be mapped similar to initrd start addr (in linux_load_initrd()).

Real value of these variables is driven by u-boot and we cannot expect nothing about this value. (order, gaps, etc..)

Sure.

It’s possible that kernel want to load kernel at begining of memory and initrd blob at end. We must support this, it's generic part of ‘distro_boot’ u-booot protocol/framework. And support of distro_boot is one of major goals for having booti supported, imo.

Sure. My understanding is that patch handles this. Physical load address of,
kernel is mapped into KVA starting at KERNBASE in locore.S
dtb blob is mapped into KVA starting at [ KERNBASE+(kernel size) ] in machdep_boot.c
initrd is mapped into KVA starting at [ KERNBASE+(kernel size) + (dtb size) ] in machdep_boot.c

This give me some conclusions:
Because of 1) optimal mapping granularity is 1GB, more fine grained granularity is nothing than waste of TLB entries and build_l2_block_pagetable() is useless. (at least I don’t see single advantage of using fine grained mappings, while disadvantage is clear - more TLB pressure ).

[1]
locore.S does maps kernel physical load addr into KVA starting at [ KERNBASE ] to [ KERNBASE + (kernel size)] using build_l2_block_pagetable. KVA is in L2 table which has 512, 2MiB entries. 1GiB L1 table entry is linked to this L2 table.

The following comments from create_pagetables: in locore.S,

  • It relys on:
  • We were loaded to an address that is on a 2MiB boundary
  • All the memory must not cross a 1GiB boundaty
  • x28 contains the physical address we were loaded from *

I am mapping dtb and initrd image into L2 table following the end of kernel. Therefore I created build_l2_block_pagetable() in machdep_boot.c
In fact if this not done dtb or initrd cannot be added into KVA. Due to above comments, I added a print in machdep_boot.c if kernel+dtb+initrd will exceed 1GiB.

Because of 2) – all 3 booti blobs must be taken separately. I don’t see single reason for postponing mapping of FDT to C code.

We have to handle initrd image in C code after loading and parsing FDT blob.

Original locoer.S works and

I would think that checking validity of FDT blob and mapping it in locore.S would make locore.S more complex. Moving it to C code would be good per my understanding.
I would also think that we just need to distinguish loader Vs U-Boot in locore.S and try to do further booti handling in C code.

build_l2_block_pagetable() is unusable.

Not sure I understand this. Please see [1].

Using lastaddr for reservation of memory used by initrd is clearly nonsense.

In your patch and machdep_boot.c for arm, lastaddr is value of KERNBASE+(kernel size). dtb is copied into KVA at [ KERNBASE+(kernel size) ].
I have copied initrd image at [ KERNBASE+(kernel size) + (dtb size) ] .

Initrd blob should be reserved by using (probably) arm_physmem_exclude_regions() and map it later , when pmap is fully bootstrapped.

Please pardon my ignorance. It could be done this way but do you see a serious problem with the above implementation? [2]

So I think that this patch needs significant rewrite. Did I forget anything? Do you want more details for something? What’s next?

Please see [2].

mmel added a comment.Sat, Nov 16, 10:21 AM

Hi Mmel,
Please find my comments inline.

After more detailed analysis, I think that there are more problems.
Let me explain situation. Givens are:
• Actual locore.S mappings of kernel expects that physical memory is aligned to 1GB boundary. This implies that we don’t support boards with lower that 1 GB memory (I don’t see other real limitation caused by this fact).

In locore.S L1 block page table entry is built on 1GiB aligned physical load addr of kernel. However this is irrespective of using loader or U-Boot to boot the FreeBSD kernel.

• U-Boot booti command expect that kernel, fdt, and initrd blobs are loaded at ${kernel_addr_r}, ${fdt_addr_r} and ${initrd_addr_r} offsets.

Yes.
As part of this patch linux_load_initrd() will do build_l1_block_pagetable() to map initrd start address aligned to 1GiB PA=VA.
(By default, initrd_addr_r will be relocated to highmem by booti in U-Boot code (do_bootm_linux->boot_prep_linux->boot_reloc_ramdisk->boot_ramdisk_high)).
locore.S maps 1GiB algined kernel_addr_r PA=VA. I assumed that fdt_addr_r ( a couple of KiB’s in size) can also be within this 1GiB aligned addr. However, fdt_addr_r can be mapped similar to initrd start addr (in linux_load_initrd()).

Real value of these variables is driven by u-boot and we cannot expect nothing about this value. (order, gaps, etc..)

Sure.

It’s possible that kernel want to load kernel at begining of memory and initrd blob at end. We must support this, it's generic part of ‘distro_boot’ u-booot protocol/framework. And support of distro_boot is one of major goals for having booti supported, imo.

Sure. My understanding is that patch handles this. Physical load address of,
kernel is mapped into KVA starting at KERNBASE in locore.S
dtb blob is mapped into KVA starting at [ KERNBASE+(kernel size) ] in machdep_boot.c
initrd is mapped into KVA starting at [ KERNBASE+(kernel size) + (dtb size) ] in machdep_boot.c

This give me some conclusions:
Because of 1) optimal mapping granularity is 1GB, more fine grained granularity is nothing than waste of TLB entries and build_l2_block_pagetable() is useless. (at least I don’t see single advantage of using fine grained mappings, while disadvantage is clear - more TLB pressure ).

[1]
locore.S does maps kernel physical load addr into KVA starting at [ KERNBASE ] to [ KERNBASE + (kernel size)] using build_l2_block_pagetable. KVA is in L2 table which has 512, 2MiB entries. 1GiB L1 table entry is linked to this L2 table.
The following comments from create_pagetables: in locore.S,

  • It relys on:
  • We were loaded to an address that is on a 2MiB boundary
  • All the memory must not cross a 1GiB boundaty
  • x28 contains the physical address we were loaded from *

I am mapping dtb and initrd image into L2 table following the end of kernel. Therefore I created build_l2_block_pagetable() in machdep_boot.c
In fact if this not done dtb or initrd cannot be added into KVA. Due to above comments, I added a print in machdep_boot.c if kernel+dtb+initrd will exceed 1GiB.

This is not exact description. You allocates and maps new memory (by using lastaddr) for fdt and initrd blobs. Then you *copy* these blobs to new location. See lines 122 and 127 for fdt, and lines 270, 273 and 275 for initrd. As you can see, you have mapped both source and destination address for initrd blob memmove(), but only destination address was mapped for fdt blob memmove() (point me to right place if I’m wrong, please).

While I agree with copy of fdt blob, doing useless full copy of initrd (hundreds of MB or ones of GB) is hard to understand and imo unacceptable). Moreover how you can guarantee that original location doesn't overlap with new( memmove() itself doesn’t help as we talking about virtual address ranges? Why you think that we have enough free memory after kernel for new copy of initrd. booti guarantees only few pages are available after kernel load location.

Because of 2) – all 3 booti blobs must be taken separately. I don’t see single reason for postponing mapping of FDT to C code.

We have to handle initrd image in C code after loading and parsing FDT blob.

Original locoer.S works and

I would think that checking validity of FDT blob and mapping it in locore.S would make locore.S more complex. Moving it to C code would be good per my understanding.
I would also think that we just need to distinguish loader Vs U-Boot in locore.S and try to do further booti handling in C code.

build_l2_block_pagetable() is unusable.

Not sure I understand this. Please see [1].

Using lastaddr for reservation of memory used by initrd is clearly nonsense.

In your patch and machdep_boot.c for arm, lastaddr is value of KERNBASE+(kernel size). dtb is copied into KVA at [ KERNBASE+(kernel size) ].
I have copied initrd image at [ KERNBASE+(kernel size) + (dtb size) ] .

Initrd blob should be reserved by using (probably) arm_physmem_exclude_regions() and map it later , when pmap is fully bootstrapped.

Please pardon my ignorance. It could be done this way but do you see a serious problem with the above implementation? [2]

Is above description enough?
Also do you have an idea how to fix now broken booti image generation( because this is something which i'm unable to solve)?

So I think that this patch needs significant rewrite. Did I forget anything? Do you want more details for something? What’s next?

Please see [2].

Hi Mmel,

In D22255#489923, @mmel wrote:

Hi Mmel,
Please find my comments inline.

After more detailed analysis, I think that there are more problems.
Let me explain situation. Givens are:
• Actual locore.S mappings of kernel expects that physical memory is aligned to 1GB boundary. This implies that we don’t support boards with lower that 1 GB memory (I don’t see other real limitation caused by this fact).

In locore.S L1 block page table entry is built on 1GiB aligned physical load addr of kernel. However this is irrespective of using loader or U-Boot to boot the FreeBSD kernel.

• U-Boot booti command expect that kernel, fdt, and initrd blobs are loaded at ${kernel_addr_r}, ${fdt_addr_r} and ${initrd_addr_r} offsets.

Yes.
As part of this patch linux_load_initrd() will do build_l1_block_pagetable() to map initrd start address aligned to 1GiB PA=VA.
(By default, initrd_addr_r will be relocated to highmem by booti in U-Boot code (do_bootm_linux->boot_prep_linux->boot_reloc_ramdisk->boot_ramdisk_high)).
locore.S maps 1GiB algined kernel_addr_r PA=VA. I assumed that fdt_addr_r ( a couple of KiB’s in size) can also be within this 1GiB aligned addr. However, fdt_addr_r can be mapped similar to initrd start addr (in linux_load_initrd()).

Real value of these variables is driven by u-boot and we cannot expect nothing about this value. (order, gaps, etc..)

Sure.

It’s possible that kernel want to load kernel at begining of memory and initrd blob at end. We must support this, it's generic part of ‘distro_boot’ u-booot protocol/framework. And support of distro_boot is one of major goals for having booti supported, imo.

Sure. My understanding is that patch handles this. Physical load address of,
kernel is mapped into KVA starting at KERNBASE in locore.S
dtb blob is mapped into KVA starting at [ KERNBASE+(kernel size) ] in machdep_boot.c
initrd is mapped into KVA starting at [ KERNBASE+(kernel size) + (dtb size) ] in machdep_boot.c

This give me some conclusions:
Because of 1) optimal mapping granularity is 1GB, more fine grained granularity is nothing than waste of TLB entries and build_l2_block_pagetable() is useless. (at least I don’t see single advantage of using fine grained mappings, while disadvantage is clear - more TLB pressure ).

[1]
locore.S does maps kernel physical load addr into KVA starting at [ KERNBASE ] to [ KERNBASE + (kernel size)] using build_l2_block_pagetable. KVA is in L2 table which has 512, 2MiB entries. 1GiB L1 table entry is linked to this L2 table.
The following comments from create_pagetables: in locore.S,

  • It relys on:
  • We were loaded to an address that is on a 2MiB boundary
  • All the memory must not cross a 1GiB boundaty
  • x28 contains the physical address we were loaded from *

I am mapping dtb and initrd image into L2 table following the end of kernel. Therefore I created build_l2_block_pagetable() in machdep_boot.c
In fact if this not done dtb or initrd cannot be added into KVA. Due to above comments, I added a print in machdep_boot.c if kernel+dtb+initrd will exceed 1GiB.

This is not exact description. You allocates and maps new memory (by using lastaddr) for fdt and initrd blobs. Then you *copy* these blobs to new location. See lines 122 and 127 for fdt, and lines 270, 273 and 275 for initrd.

Basically, I am creating 2MiB KVA L2 table entries to move the fdt and initrd blobs from PA to KVA.

As you can see, you have mapped both source and destination address for initrd blob memmove(), but only destination address was mapped for fdt blob memmove() (point me to right place if I’m wrong, please).

Sorry I did not understand this clearly. inirtd size (in counts of 2MiB) is calculated using initrd-end - inird-start. For dtb the size is available in its header.

While I agree with copy of fdt blob, doing useless full copy of initrd (hundreds of MB or ones of GB) is hard to understand and imo unacceptable).

Okay. My understanding is we need to find a way to avoid doing memmove of initrd image from PA into KVA. I'll try to explore how to achieve this. I see you have mentioned about arm_physmem_exclude_regions(). I understand that this and mapping FDT blob loaded anywhere in PA needs to be achieved. Do you see any other shortcomings?

Moreover how you can guarantee that original location doesn't overlap with new( memmove() itself doesn’t help as we talking about virtual address ranges? Why you think that we have enough free memory after kernel for new copy of initrd. booti guarantees only few pages are available after kernel load location.

It can overlap. My understanding is to potentially avoid the overlap, booti therefore, by default relocates initrd into highmem region.

For ex: Actual initrd load addr is 0x98000000 but on 4GB RAM, it is relocated to higher RAM location starting at 0xe6468000,
uboot# booti 0x90000000 0x98000000:$filesize 0x88000000

Flattened Device Tree blob at 88000000

Booting using the fdt blob at 0x88000000
Loading Ramdisk to e6468000, end feb0ae00 ... OK <========
reserving fdt memory region: addr=17e000000 size=2000000
reserving fdt memory region: addr=81000000 size=200000
Using Device Tree in place at 0000000088000000, end 00000000880095fb

Because of 2) – all 3 booti blobs must be taken separately. I don’t see single reason for postponing mapping of FDT to C code.

We have to handle initrd image in C code after loading and parsing FDT blob.

Original locoer.S works and

I would think that checking validity of FDT blob and mapping it in locore.S would make locore.S more complex. Moving it to C code would be good per my understanding.
I would also think that we just need to distinguish loader Vs U-Boot in locore.S and try to do further booti handling in C code.

build_l2_block_pagetable() is unusable.

Not sure I understand this. Please see [1].

Using lastaddr for reservation of memory used by initrd is clearly nonsense.

In your patch and machdep_boot.c for arm, lastaddr is value of KERNBASE+(kernel size). dtb is copied into KVA at [ KERNBASE+(kernel size) ].
I have copied initrd image at [ KERNBASE+(kernel size) + (dtb size) ] .

Initrd blob should be reserved by using (probably) arm_physmem_exclude_regions() and map it later , when pmap is fully bootstrapped.

Please pardon my ignorance. It could be done this way but do you see a serious problem with the above implementation? [2]

Is above description enough?

My understanding is we need to find a way to avoid doing memmove of initrd image from PA into KVA.

Also do you have an idea how to fix now broken booti image generation( because this is something which i'm unable to solve)?

Can you please give me more information on what is broken?
If your DTB is not in range of physical kernel load address aligned to 1GiB then it will not be mapped by this patch. (And I understand that we need to fix this and create PA=VA map for dtb PA as well).

So I think that this patch needs significant rewrite. Did I forget anything? Do you want more details for something? What’s next?

Please see [2].

Other than this the salient features of the patch are,

  1. Support both boot from loader and u-boot when LINUX_BOOT_ABI is set (this option can be enabled by default).
  2. Support initrd image option of booti (vfs.root.mountfrom need to set in U-Boot bootargs env variable appropriately).
  3. "go" command should also work (with fdt built into kernel)
mmel added a comment.Sat, Nov 16, 1:57 PM

Hi Mmel,

In D22255#489923, @mmel wrote:

Hi Mmel,
Please find my comments inline.

After more detailed analysis, I think that there are more problems.
Let me explain situation. Givens are:
• Actual locore.S mappings of kernel expects that physical memory is aligned to 1GB boundary. This implies that we don’t support boards with lower that 1 GB memory (I don’t see other real limitation caused by this fact).

In locore.S L1 block page table entry is built on 1GiB aligned physical load addr of kernel. However this is irrespective of using loader or U-Boot to boot the FreeBSD kernel.

• U-Boot booti command expect that kernel, fdt, and initrd blobs are loaded at ${kernel_addr_r}, ${fdt_addr_r} and ${initrd_addr_r} offsets.

Yes.
As part of this patch linux_load_initrd() will do build_l1_block_pagetable() to map initrd start address aligned to 1GiB PA=VA.
(By default, initrd_addr_r will be relocated to highmem by booti in U-Boot code (do_bootm_linux->boot_prep_linux->boot_reloc_ramdisk->boot_ramdisk_high)).
locore.S maps 1GiB algined kernel_addr_r PA=VA. I assumed that fdt_addr_r ( a couple of KiB’s in size) can also be within this 1GiB aligned addr. However, fdt_addr_r can be mapped similar to initrd start addr (in linux_load_initrd()).

Real value of these variables is driven by u-boot and we cannot expect nothing about this value. (order, gaps, etc..)

Sure.

It’s possible that kernel want to load kernel at begining of memory and initrd blob at end. We must support this, it's generic part of ‘distro_boot’ u-booot protocol/framework. And support of distro_boot is one of major goals for having booti supported, imo.

Sure. My understanding is that patch handles this. Physical load address of,
kernel is mapped into KVA starting at KERNBASE in locore.S
dtb blob is mapped into KVA starting at [ KERNBASE+(kernel size) ] in machdep_boot.c
initrd is mapped into KVA starting at [ KERNBASE+(kernel size) + (dtb size) ] in machdep_boot.c

This give me some conclusions:
Because of 1) optimal mapping granularity is 1GB, more fine grained granularity is nothing than waste of TLB entries and build_l2_block_pagetable() is useless. (at least I don’t see single advantage of using fine grained mappings, while disadvantage is clear - more TLB pressure ).

[1]
locore.S does maps kernel physical load addr into KVA starting at [ KERNBASE ] to [ KERNBASE + (kernel size)] using build_l2_block_pagetable. KVA is in L2 table which has 512, 2MiB entries. 1GiB L1 table entry is linked to this L2 table.
The following comments from create_pagetables: in locore.S,

  • It relys on:
  • We were loaded to an address that is on a 2MiB boundary
  • All the memory must not cross a 1GiB boundaty
  • x28 contains the physical address we were loaded from *

I am mapping dtb and initrd image into L2 table following the end of kernel. Therefore I created build_l2_block_pagetable() in machdep_boot.c
In fact if this not done dtb or initrd cannot be added into KVA. Due to above comments, I added a print in machdep_boot.c if kernel+dtb+initrd will exceed 1GiB.

This is not exact description. You allocates and maps new memory (by using lastaddr) for fdt and initrd blobs. Then you *copy* these blobs to new location. See lines 122 and 127 for fdt, and lines 270, 273 and 275 for initrd.

Basically, I am creating 2MiB KVA L2 table entries to move the fdt and initrd blobs from PA to KVA.

As you can see, you have mapped both source and destination address for initrd blob memmove(), but only destination address was mapped for fdt blob memmove() (point me to right place if I’m wrong, please).

Sorry I did not understand this clearly. inirtd size (in counts of 2MiB) is calculated using initrd-end - inird-start. For dtb the size is available in its header.

Problem is not about initrd but fdt blob
at line 127 we do
memmove((void *)lastaddr, dtb_ptr, dtb_size);
destination memory is mapped at line 122 by
build_l2_block_pagetable(lastaddr, dtb_size, abp
but where is mapped source memory (pointed by dtb_ptr in memmove())????

While I agree with copy of fdt blob, doing useless full copy of initrd (hundreds of MB or ones of GB) is hard to understand and imo unacceptable).

Okay. My understanding is we need to find a way to avoid doing memmove of initrd image from PA into KVA. I'll try to explore how to achieve this. I see you have mentioned about arm_physmem_exclude_regions(). I understand that this and mapping FDT blob loaded anywhere in PA needs to be achieved. Do you see any other shortcomings?

My proposal is:

  1. Take back original locore.S from D13861. This ensures that DTB blob is mapped properly and it removes necessarily of duplication of mapping functions.
  2. take back command line guard from machdep_boot.c
  3. rewrite initrd processing- we should reserve memory in early boot phase (in linux_load_initrd()), then map it later when pmap is fully bootstraped. And because initrd blob is mapped permanently, you must map it precisely, at page boundary - overmapping is not allowed there (for arm64, you must ensure that multiple mapped pagehave same cache/type attributes).

imo, you can use arm_physmem_exclude_regions() for reservation of initrd PA range, and later use pmap_kenter() to properly map it.

Moreover how you can guarantee that original location doesn't overlap with new( memmove() itself doesn’t help as we talking about virtual address ranges? Why you think that we have enough free memory after kernel for new copy of initrd. booti guarantees only few pages are available after kernel load location.

It can overlap. My understanding is to potentially avoid the overlap, booti therefore, by default relocates initrd into highmem region.

Nope, it's not guaranteed. See this https://www.kernel.org/doc/Documentation/arm64/booting.txt
"The Image must be placed text_offset bytes from a 2MB aligned base
address anywhere in usable system RAM and called there. The region
between the 2 MB aligned base address and the start of the image has no
special significance to the kernel, and may be used for other purposes.
At least image_size bytes from the start of the image must be free for
use by the kernel."

For ex: Actual initrd load addr is 0x98000000 but on 4GB RAM, it is relocated to higher RAM location starting at 0xe6468000,
uboot# booti 0x90000000 0x98000000:$filesize 0x88000000

Flattened Device Tree blob at 88000000

Booting using the fdt blob at 0x88000000
Loading Ramdisk to e6468000, end feb0ae00 ... OK <========
reserving fdt memory region: addr=17e000000 size=2000000
reserving fdt memory region: addr=81000000 size=200000
Using Device Tree in place at 0000000088000000, end 00000000880095fb

Because of 2) – all 3 booti blobs must be taken separately. I don’t see single reason for postponing mapping of FDT to C code.

We have to handle initrd image in C code after loading and parsing FDT blob.

Original locoer.S works and

I would think that checking validity of FDT blob and mapping it in locore.S would make locore.S more complex. Moving it to C code would be good per my understanding.
I would also think that we just need to distinguish loader Vs U-Boot in locore.S and try to do further booti handling in C code.

build_l2_block_pagetable() is unusable.

Not sure I understand this. Please see [1].

Using lastaddr for reservation of memory used by initrd is clearly nonsense.

In your patch and machdep_boot.c for arm, lastaddr is value of KERNBASE+(kernel size). dtb is copied into KVA at [ KERNBASE+(kernel size) ].
I have copied initrd image at [ KERNBASE+(kernel size) + (dtb size) ] .

Initrd blob should be reserved by using (probably) arm_physmem_exclude_regions() and map it later , when pmap is fully bootstrapped.

Please pardon my ignorance. It could be done this way but do you see a serious problem with the above implementation? [2]

Is above description enough?

My understanding is we need to find a way to avoid doing memmove of initrd image from PA into KVA.

See above.

Also do you have an idea how to fix now broken booti image generation( because this is something which i'm unable to solve)?

Can you please give me more information on what is broken?

I wrote it previously. Actual approach for booti image generation (do second link) keeps kernel, kernel.bin, kernel.full and kernel .debug out of sync. So it' not possible to debug or dtrace kernel loaded by booti. For me is this very serious problem, it’s a real reason why I not committed D13861.

If your DTB is not in range of physical kernel load address aligned to 1GiB then it will not be mapped by this patch. (And I understand that we need to fix this and create PA=VA map for dtb PA as well).

So I think that this patch needs significant rewrite. Did I forget anything? Do you want more details for something? What’s next?

Please see [2].

Other than this the salient features of the patch are,

  1. Support both boot from loader and u-boot when LINUX_BOOT_ABI is set (this option can be enabled by default).
  2. Support initrd image option of booti (vfs.root.mountfrom need to set in U-Boot bootargs env variable appropriately).
  3. "go" command should also work (with fdt built into kernel)

Nice, so you added support for initrd to original patch. :)