Page MenuHomeFreeBSD

biosboot: Detect memory disks from PXE
ClosedPublic

Authored by imp on May 30 2024, 1:35 AM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Aug 4, 8:08 AM
Unknown Object (File)
Mon, Jul 28, 9:41 AM
Unknown Object (File)
Mon, Jul 28, 6:18 AM
Unknown Object (File)
Mon, Jul 28, 5:23 AM
Unknown Object (File)
Mon, Jul 28, 3:19 AM
Unknown Object (File)
Mon, Jul 28, 12:48 AM
Unknown Object (File)
Sat, Jul 26, 3:57 PM
Unknown Object (File)
Sat, Jul 26, 3:50 PM

Details

Summary

Walk through the disk driver entries chained off of INT13.

MEMDISK is part of the Syslinux project; it loads disk images into
memory, sets an int 13h hook and then does a BIOS boot from the image;
this can be used as part of a PXE boot environment to load installer
disks, however the disks are not accessible from inside the FreeBSD
kernel because it doesn't access disks through BIOS APIs.

This patch detects the disk images in the loader, and passes their
address and length as a driver hint. When the md driver sees the hint,
it maps the image, and presents it to the system.

Test Plan

I'm most confident about the boot loader changes (though the arbitrary limit likely needs some help)
I need review and ideally testing of the md.
Also, I'm not at all sure that we reserve the area, so maybe kernel allocations overwrite the ram disk? How do I prevent that?

To test this out (once in the tree, freebsd-bootable-image.img could be mini-rootdisk.img):
% sudo pkg install ipxe qemu
% mkdir testdir
% cd testdir
% fetch https://bapt.nours.eu/memdisk
% cp ~/mumble/freebsd-bootable-image.img .
% cat freebsd.ixp < EOF
#!ipxe
initrd tftp://10.0.2.2/freebsd-bootable-image.img
chain tftp://10.0.2.2/memdisk harddisk raw
EOF
% qemu-system-x86_64 -boot n -m 4g -cdrom /usr/local/share/ipxe/ipxe.iso -device virtio-net,netdev=n1 -netdev user,id=n1,tftp=$(pwd),bootfile=/freebsd.ipxe

(thanks to bapt for that recipe)

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 57960
Build 54848: arc lint + arc unit

Event Timeline

imp requested review of this revision.May 30 2024, 1:35 AM

Thanks for picking this up!

I'm most confident about the boot loader changes (though the arbitrary limit likely needs some help)

I don't remember where that 32 came from; I think I just wanted to have a limit, so the loop would be be bounded, even if data was

Also, I'm not at all sure that we reserve the area, so maybe kernel allocations overwrite the ram disk? How do I prevent that?

memdisk adjusts the e820 memory map, so the memory it uses for the disk image will not be available memory for the kernel, but it seems like a good idea to somehow ensure that.

stand/i386/libi386/biosmemdisk.c
63

I have been testing this patch, but it remains stuck in the loader: it prints everything, but I cannot hit any key, the timer is not running.

I have been testing this patch, but it remains stuck in the loader: it prints everything, but I cannot hit any key, the timer is not running.

So we're hitting an infinite loop before interact()... That's going to be fun to debug... Is this with or without ram disks configured for iPXE?

this is with ramdisk configured via ipxe

After more testing:

INT 13 08: Failure, assuming this is the only drive^M
Drive probing gives drive shift limit: 0x81^M
old: int13 = f000e3fe  int15 = f000f859  int1e = f000607c^M
new: int13 = 9b80000a  int15 = 9b8003ba  int1e = f000607c^M
Loading boot sector... booting...^M

I can see the kernel booting but it fails at mounting the root with error 19
in the mountfrom prompt I get:

 mountroot> random: unblocking device.
? 

List of GEOM managed disk devices:
      iso9660/iPXE cd0

mountroot>

Indeed, there is the question of whether the mapped memory could be used by the VM. russor_ruka.org implies it should not as it is removed from the BIOS SMAP.

I'm not aware of a common facility to exclude physical memory regions early, or one to do that after vm_page_startup() starts (and bases its actions upon phys_avail[] and vm_phys_early_segs[]). There are two ways finally leading to filling phys_avail[]: amd64 & i386 use physmap[] (see add_physmap_entry()/bios_add_smap_entries()), whereas arm64 (and riscv IIRC) use the facilities in subr_physmem.c (there is a physmem_exclude_region() function). If limited to amd64, this would have to go into machdep.c. I'm not sure if it's possible to query hints that early.

stand/i386/libi386/biosmemdisk.c
59–61

(minor)

  • Switch C++ comments to C ones, as some other "VERY important" (see style(9)) single-line comments below use original C syntax?
  • Cosmetic: Would write 0x4C as 0x13 * 4.
89–96

(minor) This assignment feels out of place. Move it close to the checksum computation?

Indeed, there is the question of whether the mapped memory could be used by the VM. russor_ruka.org implies it should not as it is removed from the BIOS SMAP.

I'm not aware of a common facility to exclude physical memory regions early, or one to do that after vm_page_startup() starts (and bases its actions upon phys_avail[] and vm_phys_early_segs[]). There are two ways finally leading to filling phys_avail[]: amd64 & i386 use physmap[] (see add_physmap_entry()/bios_add_smap_entries()), whereas arm64 (and riscv IIRC) use the facilities in subr_physmem.c (there is a physmem_exclude_region() function). If limited to amd64, this would have to go into machdep.c. I'm not sure if it's possible to query hints that early.

This is only possible on x86 BIOS systems, so only i386 and amd64 systems only. I have this on my radar to test / fix / audit since it's a boot-loader thing and things passed from the boot loader are special. There's other methods for passing ram disks from other setups (like initrd for linux boot / UEFI booting). I have these changes queued and in my list to commit.

I'm unsure about it being removed from BIOS SMAP. On the one hand, we'd have to remove it from the avail map if it is in the SMAP. On the other, it just works as if this were treated as a "device with memory" by the OS. So I'm thinking yes, it can be used by the VM, just like we can use other memory mapped items. It's on my list of things to check into more deeply before committing.

stand/i386/libi386/biosmemdisk.c
66

I was planning on making this a #define and define it to more like 2 or 4. Though it doesn't hurt to iterate a few times.

75

I was considering making this 'bootverbose'

kib added inline comments.
stand/i386/libi386/biosmemdisk.c
49
60

So this is for BIOS boot only? There is no equivalent for UEFI boot method?

69
72

No need to have this comment multi-line

sys/dev/md/md.c
2094

We must start support unmapped preloaded images. I will provide the patch.

2096

I proposed the patch to support unmapped preloaded images in D51128, untested.
This patch would require a trivial adaptation, in particular, remove the pmap_map() line, and pass paddr as the second arg to md_preloaded():

md_preloaded(0, paddr, len, scratch, false);
In D45404#1166918, @kib wrote:

I proposed the patch to support unmapped preloaded images in D51128, untested.
This patch would require a trivial adaptation, in particular, remove the pmap_map() line, and pass paddr as the second arg to md_preloaded():

md_preloaded(0, paddr, len, scratch, false);

OK. Should I land both with you as the author of the other review? Or how can we logistically handle this?

sys/dev/md/md.c
2094

so with your patches, we can support them? That's what it looks like

In D45404#1168886, @imp wrote:
In D45404#1166918, @kib wrote:

I proposed the patch to support unmapped preloaded images in D51128, untested.
This patch would require a trivial adaptation, in particular, remove the pmap_map() line, and pass paddr as the second arg to md_preloaded():

md_preloaded(0, paddr, len, scratch, false);

OK. Should I land both with you as the author of the other review? Or how can we logistically handle this?

This review needs fixes, unrelated to the unmapped feature. You can push it as is after the fixes are done.
Then I can provide the full patch for unmapped preload disks, with the adjustments I described above, but I cannot test it.

sys/dev/md/md.c
2094

Yes.

imp edited the test plan for this revision. (Show Details)

update for all the review comments
Add some comments based on delving into where this is defined.
Added testing info to review.
Does not have kib's changes (I've download them and have them in my testing tree)

This revision was not accepted when it landed; it landed in state Needs Review.Wed, Jul 23, 12:11 AM
This revision was automatically updated to reflect the committed changes.