
biosboot: Detect memory disks from PXE
ClosedPublic

Authored by imp on May 30 2024, 1:35 AM.
Tags
None

Details

Summary

Walk through the disk driver entries chained off of INT13.

MEMDISK is part of the Syslinux project; it loads disk images into
memory, sets an INT 13h hook, and then does a BIOS boot from the image.
This can be used as part of a PXE boot environment to load installer
disks; however, the disks are not accessible from inside the FreeBSD
kernel because it doesn't access disks through BIOS APIs.

This patch detects the disk images in the loader, and passes their
address and length as a driver hint. When the md driver sees the hint,
it maps the image, and presents it to the system.

Test Plan

I'm most confident about the boot loader changes (though the arbitrary limit likely needs some help)
I need review and, ideally, testing of the md changes.
Also, I'm not at all sure that we reserve the area, so maybe kernel allocations overwrite the ram disk? How do I prevent that?

To test this out (once in the tree, freebsd-bootable-image.img could be mini-rootdisk.img):
% sudo pkg install ipxe qemu
% mkdir testdir
% cd testdir
% fetch https://bapt.nours.eu/memdisk
% cp ~/mumble/freebsd-bootable-image.img .
% cat > freebsd.ipxe << EOF
#!ipxe
initrd tftp://10.0.2.2/freebsd-bootable-image.img
chain tftp://10.0.2.2/memdisk harddisk raw
EOF
% qemu-system-x86_64 -boot n -m 4g -cdrom /usr/local/share/ipxe/ipxe.iso -device virtio-net,netdev=n1 -netdev user,id=n1,tftp=$(pwd),bootfile=/freebsd.ipxe

(thanks to bapt for that recipe)

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 65619
Build 62502: arc lint + arc unit

Event Timeline

imp requested review of this revision. May 30 2024, 1:35 AM

Thanks for picking this up!

I'm most confident about the boot loader changes (though the arbitrary limit likely needs some help)

I don't remember where that 32 came from; I think I just wanted to have a limit, so the loop would be bounded even if the data was corrupt.

Also, I'm not at all sure that we reserve the area, so maybe kernel allocations overwrite the ram disk? How do I prevent that?

memdisk adjusts the e820 memory map, so the memory it uses for the disk image will not be available memory for the kernel, but it seems like a good idea to somehow ensure that.

stand/i386/libi386/biosmemdisk.c
64

I have been testing this patch, but it remains stuck in the loader: it prints everything, but I cannot hit any key, the timer is not running.

I have been testing this patch, but it remains stuck in the loader: it prints everything, but I cannot hit any key, the timer is not running.

So we're hitting an infinite loop before interact()... That's going to be fun to debug... Is this with or without ram disks configured for iPXE?

this is with ramdisk configured via ipxe

After more testing:

INT 13 08: Failure, assuming this is the only drive
Drive probing gives drive shift limit: 0x81
old: int13 = f000e3fe  int15 = f000f859  int1e = f000607c
new: int13 = 9b80000a  int15 = 9b8003ba  int1e = f000607c
Loading boot sector... booting...

I can see the kernel booting but it fails at mounting the root with error 19
in the mountfrom prompt I get:

 mountroot> random: unblocking device.
? 

List of GEOM managed disk devices:
      iso9660/iPXE cd0

mountroot>

Indeed, there is the question of whether the mapped memory could be used by the VM. russor_ruka.org implies it should not as it is removed from the BIOS SMAP.

I'm not aware of a common facility to exclude physical memory regions early, or one to do that after vm_page_startup() starts (and bases its actions upon phys_avail[] and vm_phys_early_segs[]). There are two ways finally leading to filling phys_avail[]: amd64 & i386 use physmap[] (see add_physmap_entry()/bios_add_smap_entries()), whereas arm64 (and riscv IIRC) use the facilities in subr_physmem.c (there is a physmem_exclude_region() function). If limited to amd64, this would have to go into machdep.c. I'm not sure if it's possible to query hints that early.

stand/i386/libi386/biosmemdisk.c
60–62

(minor)

  • Switch C++ comments to C ones, as some other "VERY important" (see style(9)) single-line comments below use original C syntax?
  • Cosmetic: Would write 0x4C as 0x13 * 4.
90–97

(minor) This assignment feels out of place. Move it close to the checksum computation?

Indeed, there is the question of whether the mapped memory could be used by the VM. russor_ruka.org implies it should not as it is removed from the BIOS SMAP.

I'm not aware of a common facility to exclude physical memory regions early, or one to do that after vm_page_startup() starts (and bases its actions upon phys_avail[] and vm_phys_early_segs[]). There are two ways finally leading to filling phys_avail[]: amd64 & i386 use physmap[] (see add_physmap_entry()/bios_add_smap_entries()), whereas arm64 (and riscv IIRC) use the facilities in subr_physmem.c (there is a physmem_exclude_region() function). If limited to amd64, this would have to go into machdep.c. I'm not sure if it's possible to query hints that early.

This is only possible on x86 BIOS systems, so i386 and amd64 only. I have this on my radar to test / fix / audit since it's a boot-loader thing and things passed from the boot loader are special. There are other methods for passing ram disks from other setups (like initrd for Linux boot / UEFI booting). I have these changes queued and in my list to commit.

I'm unsure about it being removed from BIOS SMAP. On the one hand, we'd have to remove it from the avail map if it is in the SMAP. On the other, it just works as if this were treated as a "device with memory" by the OS. So I'm thinking yes, it can be used by the VM, just like we can use other memory mapped items. It's on my list of things to check into more deeply before committing.

stand/i386/libi386/biosmemdisk.c
67

I was planning on making this a #define and defining it to something more like 2 or 4. Though it doesn't hurt to iterate a few times.

76

I was considering making this 'bootverbose'

kib added inline comments.
stand/i386/libi386/biosmemdisk.c
50
61

So this is for BIOS boot only? There is no equivalent for UEFI boot method?

70
73

No need to have this comment multi-line

sys/dev/md/md.c
2139

We must start supporting unmapped preloaded images. I will provide the patch.

2141

I proposed the patch to support unmapped preloaded images in D51128, untested.
This patch would require a trivial adaptation, in particular, remove the pmap_map() line, and pass paddr as the second arg to md_preloaded():

md_preloaded(0, paddr, len, scratch, false);
In D45404#1166918, @kib wrote:

I proposed the patch to support unmapped preloaded images in D51128, untested.
This patch would require a trivial adaptation, in particular, remove the pmap_map() line, and pass paddr as the second arg to md_preloaded():

md_preloaded(0, paddr, len, scratch, false);

OK. Should I land both with you as the author of the other review? Or how can we logistically handle this?

sys/dev/md/md.c
2139

so with your patches, we can support them? That's what it looks like

In D45404#1168886, @imp wrote:
In D45404#1166918, @kib wrote:

I proposed the patch to support unmapped preloaded images in D51128, untested.
This patch would require a trivial adaptation, in particular, remove the pmap_map() line, and pass paddr as the second arg to md_preloaded():

md_preloaded(0, paddr, len, scratch, false);

OK. Should I land both with you as the author of the other review? Or how can we logistically handle this?

This review needs fixes, unrelated to the unmapped feature. You can push it as is after the fixes are done.
Then I can provide the full patch for unmapped preload disks, with the adjustments I described above, but I cannot test it.

sys/dev/md/md.c
2139

Yes.

imp edited the test plan for this revision. (Show Details)

update for all the review comments
Add some comments based on delving into where this is defined.
Added testing info to review.
Does not have kib's changes (I've downloaded them and have them in my testing tree)

This revision was not accepted when it landed; it landed in state Needs Review. Jul 23 2025, 12:11 AM
This revision was automatically updated to reflect the committed changes.

I'd potentially like to extend this change to support the "cmdline" field that memdisk passes in the ACPI-like table segment, to support Cobbler PXE bootstrap. I'll raise a separate Differential at that time.

In D45404#1262069, @bms wrote:

I'd potentially like to extend this change to support the "cmdline" field that memdisk passes in the ACPI-like table segment, to support Cobbler PXE bootstrap. I'll raise a separate Differential at that time.

Sure, while the world is moving beyond BIOS booting, there are still niches where UEFI has trouble penetrating...

stand/i386/libi386/biosmemdisk.c
61

UEFI specifies a way to do this, but it would be completely different. It wouldn't find the chain of drivers from the INT13 interrupt handler to chain to... It would find it in the SYSTEM_TABLES with a certain UUID IIRC. 90% of this is 'find it and validate' so I'm not sure we could repurpose this code for UEFI... I've thought about doing something similar to how Linux can find its initrd via a UEFI driver with a certain UUID for FreeBSD, but haven't made it work yet for Linux in my test setup.

stand/i386/libi386/biosmemdisk.c
61

For UEFI, I did some work at the end of 2025 and early 2026. The least-effort way forward is to use the UEFI RamDiskProtocol, which allows the image to be mounted while in UEFI boot services and also registers the memory range as an NVDIMM.

Then existing nvdimm drivers will work to access the ramdisk. FreeBSD has nvdimm(4), although it's not enabled in GENERIC; I think it may only be tested on amd64. Linux calls their driver pmem. Windows has an NVDIMM driver, but not for simple ramdisks (oh well?).

I put together what I call memdisk_uefi that works similarly to syslinux's memdisk:

https://github.com/russor/memdisk_uefi

It uses an iPXE api to download the image, and it could be improved in many ways, but it's something :)

For FreeBSD, you can boot an unmodified installer image, but you need to manually load the nvdimm module (I tested with 15.0, but the driver was added in 12.0, and I'd expect anything after that to work). For Linux, it just worked with the image I had on hand (ubuntu-24.04.1-live-server-amd64.iso); I'd imagine some images may not have the pmem driver available, but it was added in Linux 4.2 (2015), so it's just a matter of what was enabled in the installer kernel.

stand/i386/libi386/biosmemdisk.c
61

nvdimm can be loaded as module.

The issue with nvdimm(4) is that it requires a lot from the ACPI tables to find the image. If you fabricate that, it must not break configurations where real nvdimms with corresponding bios-provided NFIT exist.

I do not even quite see how to patch existing NFIT to add ramdisk. Plus nvdimm(4) expects that nvdimms units are enumerated as ACPI devices.

stand/i386/libi386/biosmemdisk.c
61

There is code in edk2 that patches an NFIT, but I didn't include that in memdisk_uefi because it relies on UEFI firmware services; AFAIK, UEFI boot services don't make it easy to find an existing table to patch.

Here's my code which just assumes there's no existing NFIT table https://github.com/russor/memdisk_uefi/blob/main/from_edk.c#L49

Here's edk2's code where they take an existing NFIT table and extend it to add another entry: https://github.com/tianocore/edk2/blob/f0542ae07d5e5e4d311bf8ae33bf26b4b1acf9f4/MdeModulePkg/Universal/Disk/RamDiskDxe/RamDiskProtocol.c#L239

There is an ACPI device added as well, but it's very simple:

https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Universal/Disk/RamDiskDxe/RamDisk.asl

This is something the edk2 code also does:

https://github.com/tianocore/edk2/blob/f0542ae07d5e5e4d311bf8ae33bf26b4b1acf9f4/MdeModulePkg/Universal/Disk/RamDiskDxe/RamDiskProtocol.c#L65

stand/i386/libi386/biosmemdisk.c
61

So, tables are relatively easy to patch in the boot loader, but I'm uneasy about that.

I'd rather strongly prefer, though, we don't tie this feature to the nvdimm driver. Linux has its own protocols to pass its initrd from EFI to the kernel. Linux does this because nvdimm has a lot of extra baggage. However, maybe there's something here I'm not understanding. Why is this a good idea, other than we have a bit of code in the nvdimm driver that can find it?

I am following this Differential closely because: following Ch6 of my PhD regarding the under-published ILNPSync (a distributed site border router state exchange protocol somewhat patterned after pfsync(4)), I want to implement a change to introduce the use of the UEFI MonotonicCounter.

This belongs in a new Differential, though candidate reviewers appear to be here.

The MonotonicCounter would be optionally fed into network stack originated HMACs during system boot, to defeat replay attacks. net-snmp currently implements something close (as per PhD footnote) by persisting its snmpEngineBoots OID to a disk file; close but no cigar, as it is not accessible during boot.

Loader and kernel support will be needed, as both Boot Services and Runtime Services need to be invoked with separate calls.

I have observed that EDK2 derived EFI firmware generally stores the counter's contents as an EFI Global Variable, however this is not guaranteed. I have a pair of Dell Optiplex 3060s to experiment with, and of course Microsoft Hyper-V, which uses Project Mu derived EFI emulation.

In D45404#1262412, @bms wrote:

... the under-published ILNPSync (a distributed site border router state exchange protocol somewhat patterned after pfsync(4))...

That's a whole other project BTW. dSBR-ILNPv6 needs an utter rethink, now that IPv6 Segment Routing (SRv6) is a thing.

This is potentially semi-commercial work which I'm surprised Juniper have not already contributed towards, given Linux has an SRv6 data plane already.

stand/i386/libi386/biosmemdisk.c
61

Well I think it's a convenient idea because it allows an unmodified installer image to be used.

From what I've seen (which is certainly limited), netboot for installers on BIOS tends towards memdisk + a cd/usb image. The image runs its bootloader with BIOS hooks and then a little bit of magic (this diff) finds the memdisk image for the OS to load when the BIOS hooks go away.

memdisk_uefi allows the same experience, using the nvdimm table for the magic to find the disk, because it feels like cross-platform magic that's already there (and is used by edk2's UEFI memory disk when it happens in early firmware, before boot services -- I believe some firmware packages allow loading a disk image to boot from in their user interface, although I don't think I have access to any that do); it would just need the nvdimm driver to be compiled into the kernel for installer images, and Linux PXE setups could use it too; although since nobody else put together a memdisk_uefi, maybe everyone is happy booting Linux with kernel + initrd.

Using kernel + initrd means when you're setting up the PXE server you can't just download one image; you have to download two files, or download the iso and pull out the kernel/bootloader.efi to run. There's a chance you update the initrd but not the kernel, which might not work depending on what's going on with syscalls. If you use memdisk for BIOS and kernel+initrd for UEFI, it seems easy to update one but not the other.

stand/i386/libi386/biosmemdisk.c
61

One other thing I wanted to add is that loading the bootloader from the image provides the full bootloader experience -- if you need to add a kernel parameter or load a module, you can easily do so; it's difficult to recreate everything you can do ad-hoc in FreeBSD's loader in ipxe.

Also if the user has a setup where the bootloader won't work, but the kernel loaded by pxe does work, it might be less confusing if the installer doesn't start than if the installer works but the installed system doesn't boot. There's certainly some amount of that anyway if the user installs onto a disk that their firmware won't boot from.

stand/i386/libi386/biosmemdisk.c
61

Several things:
(1) It's already possible to build loader.efi with built-in ram disk that's used by both the loader and the kernel. The handoff is trivial and doesn't require any extra code. So we already can build this experience, I'm not sure what needs to be added. There's no need for there to be any support in the UEFI loader to make this happen. We could have better tooling to build this from only release artifacts, but I've used this on several machines. It's not super well tested or promoted, but we can fix that. It's one loader.efi that you can load via ipxe or off the ESP. You get the full loader experience (in addition to the option of booting off a disk image if you want).
(2) The kernel itself allows adding a ram disk for booting, though the loader doesn't have access to this ram disk. This path has been used for a long time, dating back to the earliest days of the project w/o needing loader support.

Maybe I don't have enough experience with the UEFI specific thing, nor what it enables beyond these two use cases.

stand/i386/libi386/biosmemdisk.c
61

So, tables are relatively easy to patch in the boot loader, but I'm uneasy about that.

I'd rather strongly prefer, though, we don't tie this feature to the nvdimm driver.

Strongly agree.
It might work now because the nvdimm(4) driver is relatively unadvanced. It does not use the features like block window or persistence. The moment it starts doing so, this construct would break.

stand/i386/libi386/biosmemdisk.c
61

Yea, loader.efi is either going to have to hack the tables to pass in the ram disk, or pass the ram disk in via the normal metadata it already passes in. Either way, the loader will need a disk driver so it can read bits of the file. But I'm struggling to understand why passing it in via the path normally used for nvdimm is a win. Either way, the kernel needn't change, and the kernel would get the right root filesystem.

stand/i386/libi386/biosmemdisk.c
61

Ok, so... using NVDIMM for a virtual disk / virtual CD is part of the NVDIMM spec. There are UUIDs for it. It's part of the UEFI RamDisk spec and the edk2 reference implementation. Patching it into the tables at boot services instead of during platform initialization is probably not really within the spec. I don't know much about real NVDIMM hardware, or whether any of it only has the minimal interface like the ramdisk implementation. I believe the Windows NVDIMM driver does expect a more featured nvdimm, which may be what's preventing mounting such a RamDisk in Windows (you can boot a Windows installer image, but once the kernel is loaded, it can't access the RamDisk).

Anyway, you two don't seem comfortable with it, so I'll stop pushing on it after this message. It's just a very minimal change to FreeBSD (to enable the driver in the installer kernel) and it's a working option. I'd love to be able to easily netboot an installer image when I need to, rather than fiddling with USB sticks; I don't want to deal with booting from NFS, that's a fun rabbit hole for another time :D

With syslinux's memdisk, my memdisk_uefi.elf, and a freebsd ISO in the same (http) path as the menu

An ipxe config for BIOS looks like:

initrd FreeBSD-15.0-RELEASE-amd64-bootonly.iso
chain memdisk raw

a pxe config for UEFI looks like

boot memdisk_uefi.elf ${cwduri}FreeBSD-15.0-RELEASE-amd64-bootonly.iso

and it works with the published installer image (other than you have to manually load the nvdimm module because it's not built into the kernel); it works with the mfsBSD images published on https://mfsbsd.vx.sk/ which otherwise were hard to PXE boot with UEFI; those don't need the NVDIMM stuff, because they load the ramdisk off the (firmware-mounted) filesystem from the loader.

memdisk_uefi is outside of FreeBSD, just like syslinux's memdisk.

I'd be interested in a pointer to building a loader.efi with an embedded ramdisk. That seems interesting too, and could maybe be a single file to tell iPXE to boot?

stand/i386/libi386/biosmemdisk.c
61

There are a few different UUIDs defined in UEFI for different kinds of ram disks. I'm not entirely opposed to this, but I don't want it to be our preferred route. And I really don't want to go through the nvdimm driver if we can avoid it... I know other OSes do, but I really dislike it. But it sounds like there may be ways forward to get the functionality via a slightly different vector.

So this is potentially very interesting.

stand/i386/libi386/biosmemdisk.c
61

Also, check out sys/tools/embed_mfs.sh