
EFI boot1 and loader Refactoring
AbandonedPublic

Authored by eric_metricspace.net on Aug 21 2016, 12:10 PM.

Details

Summary

This patch refactors the EFI boot1 and loader components, bringing them more in line with the EFI API. A summary of changes follows:

  • An efifs driver is provided which translates the EFI_SIMPLE_FILE_SYSTEM interface into the fs_ops interface. This replaces all existing fs_ops drivers in loader (a rough sketch of the idea follows this list).
  • The sys/boot/efi/drivers directory is set up to contain files that provide drivers using the EFI API interfaces. The efipart driver is moved from loader into this directory.
  • A dual to the efifs driver which translates the fs_ops interface to the EFI_SIMPLE_FILE_SYSTEM interface is provided. This provides support for all existing filesystems through the EFI_SIMPLE_FILE_SYSTEM interface.
  • boot1 is converted to use libefi, install the EFI drivers from sys/boot/efi/drivers, and use the EFI_SIMPLE_FILE_SYSTEM interface to load loader.efi. Most of the existing boot1 code, including all the boot modules, is discarded, as it is replaced by the code in sys/boot/efi/drivers.
  • Some refactoring of efipart has taken place to make it more closely resemble the EFI driver model. However, EFI drivers support dynamic discovery, and this is currently incompatible with the design of the bcache system, thus it could not be completely converted to the EFI driver model.
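
As a rough illustration of the efifs idea, the fs_ops entry points can simply delegate to the EFI file protocol. This is a simplified sketch, not the code in the patch: the efifs_volume() and efifs_cvtpath() helpers, the buffer size, and the use of f_fsdata are assumptions, and the type names follow the UEFI spec spelling rather than any particular header in the tree.

#include <stand.h>

#include <efi.h>
#include <efilib.h>

/*
 * Sketch only: an fs_ops front end whose open/read delegate to the EFI
 * file protocol.  efifs_volume() is an assumed helper that returns the
 * EFI_SIMPLE_FILE_SYSTEM_PROTOCOL behind the device 'f' was opened on,
 * and efifs_cvtpath() is an assumed '/'-to-'\\' CHAR16 path converter.
 */
static EFI_SIMPLE_FILE_SYSTEM_PROTOCOL *efifs_volume(struct open_file *f);
static void efifs_cvtpath(const char *path, CHAR16 *out, size_t outlen);

static int
efifs_open(const char *path, struct open_file *f)
{
    EFI_SIMPLE_FILE_SYSTEM_PROTOCOL *sfs = efifs_volume(f);
    EFI_FILE_PROTOCOL *root, *file;
    CHAR16 upath[256];

    if (sfs == NULL)
        return (EINVAL);
    efifs_cvtpath(path, upath, 256);
    if (EFI_ERROR(sfs->OpenVolume(sfs, &root)))
        return (EIO);
    if (EFI_ERROR(root->Open(root, &file, upath, EFI_FILE_MODE_READ, 0))) {
        root->Close(root);
        return (ENOENT);
    }
    root->Close(root);
    f->f_fsdata = file;        /* keep the EFI file handle for read/close */
    return (0);
}

static int
efifs_read(struct open_file *f, void *buf, size_t size, size_t *resid)
{
    EFI_FILE_PROTOCOL *file = f->f_fsdata;
    UINTN sz = size;

    if (EFI_ERROR(file->Read(file, &sz, buf)))
        return (EIO);
    *resid = size - sz;        /* bytes not transferred, as fs_ops expects */
    return (0);
}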

This patch is the first component of EFI GELI support and provides necessary refactoring for that work. However, the patch has a rationale as a standalone patch:

  • Unification of the boot1 and loader drivers eliminates the need to support two sets of drivers.
  • The complexity of boot1 is greatly reduced.
  • Drivers are initialized and installed on device handles only once during the execution of both boot1 and loader. This has benefits now, but becomes particularly important with regard to GELI, where initialization implies asking for passwords.
  • Both boot1 and loader now use the EFI_SIMPLE_FILE_SYSTEM interface to load files. This means they can make use of drivers installed by GRUB, coreboot, and other such loaders.
  • It should be fairly straightforward to export the existing filesystem drivers as EFI drivers, thereby providing implementations to GRUB and other EFI-based loaders.
  • Having loader use the EFI driver-based model provides the foundation for hot-pluggable device support. Note that this is not fully complete, as refactoring of bcache to support dynamic device addition/removal is necessary for this to work.
Test Plan

This patch should be behavior-neutral, as it is a refactoring. The boot1.efi and loader.efi binaries produced by this patch have been in active use by me for some time now, and have been tested with various combinations of the old and new boot1 and loader.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden.
sys/boot/efi/boot1/boot1.c
102

message is not consistent with function name.

116

message is not consistent with function name.

123

message is not consistent with function name.

  1. The partX devices are absolutely confusing for the user - you have no way to even guess which devices are actually used; this is also inconsistent with the BIOS version...

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

There isn't a need to clean up memory after boot1, and in fact, that will probably break things. Once an EFI driver is initialized and attached to handles, it stays around until ExitBootServices or until the driver is unloaded (which currently isn't done ever).

boot1 and loader both attempt to attach their drivers to each handle, but at most one of them (usually boot1) will succeed. The only reason loader does a driver initialization/attach pass is to handle the case where GRUB is loading loader directly, in fact. If you know for a fact you're using boot1, you could compile those bits out of loader completely.

The GELI patch actually needs this to be the case, as it's how the patch ensures that the user only ever gets asked for a password once.

lib/libstand/stand.h
27

There is a non-whitespace change. See addition of DEVT_EFI

Re: device names - there are options here. EFI has a protocol that can be used to attach an arbitrary string as a description. It wouldn't be too hard to print that name out if it's present on a handle. You could generate those names from a number of sources, in fact: filesystem labels, ZFS UUIDs, or EFI device paths as a fallback mechanism.

The preferred stuff probably got squashed in a merge. A question worth asking: should I reinstate it as-is, or is there any minor improvement that could be done opportunistically?

sys/boot/efi/boot1/boot1.c
497

This probably got squashed by a merge. I will look into restoring it.

In D7589#157710, @imp wrote:

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

I would like it if loader.efi understood partitions. @manu looked into it at BSDCan, however from my understanding, he found it was easier to teach U-Boot to expose partitions to loader.efi than add the existing GPT/MBR code to loader.efi.

If someone was to get it working I would imagine we could rename the partitions to be something like disk0p0 or disk1s3. The device name may be different than the kernel, however the partition id should be the same.

In D7589#157710, @imp wrote:

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

I would like it if loader.efi understood partitions. @manu looked into it at BSDCan, however from my understanding, he found it was easier to teach U-Boot to expose partitions to loader.efi than add the existing GPT/MBR code to loader.efi.

I spent some quality time with that code over several weeks to try to get nextboot working with the FreeBSD extensions to GPT and gave up. UEFI just doesn't export the data needed, and reinterpreting things via the raw view of the disk it exports was troublesome... That's why I implemented a kludge in forth to add one to the part name when certain files exist in /boot to kinda implement nextboot / active partition stuff.

If someone was to get it working I would imagine we could rename the partitions to be something like disk0p0 or disk1s3. The device name may be different than the kernel, however the partition id should be the same.

Yes. That would be nice. However we need to be a little careful here. This doesn't solve the random disk ordering issues that I've seen that lead to my rant about WTF is part10... See, my original FORTH code was stupid and would increment to the next partition and come up with part20...

In D7589#157710, @imp wrote:

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

I would like it if loader.efi understood partitions. @manu looked into it at BSDCan, however from my understanding, he found it was easier to teach U-Boot to expose partitions to loader.efi than add the existing GPT/MBR code to loader.efi.

If someone was to get it working I would imagine we could rename the partitions to be something like disk0p0 or disk1s3. The device name may be different than the kernel, however the partition id should be the same.

I think that the problem I got during bsdcan was boot1 unable to work well with unaligned partition so I would need to find time to look at this again (because of course I don't have the patches anymore).

On the review itself I'm a bit concerned about what this would mean for U-Boot.
I admit that I haven't read all the code but it seems that with this patch the UEFI implementation would need to support EFI_SIMPLE_FILE_SYSTEM (which U-Boot doesn't).
It is true that arm/arm64 aren't Tier-1 archs like AMD64, and that a UEFI implementation is required to support this protocol (from my understanding of the spec), but all the effort I've put into running boot1.efi/loader.efi on arm/arm64 (because this is the right way to go) would be useless if this is the case.

Erik: Can you confirm whether EFI_SIMPLE_FILE_SYSTEM is needed?
If this is the case we would need a fallback on what boot1/loader is currently doing.
Thanks

sys/boot/efi/boot1/boot1.c
423–427

We repeat this code over and over. Better to use the goto out; exit.
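
For reference, the single-exit pattern being suggested looks roughly like this (purely illustrative; step_one()/step_two() are placeholders, not functions in boot1.c):

#include <stand.h>

static int step_one(void *buf, size_t size);    /* placeholder */
static int step_two(void *buf, size_t size);    /* placeholder */

static int
do_work(size_t size)
{
    void *buf = NULL;
    int error;

    buf = malloc(size);
    if (buf == NULL) {
        error = ENOMEM;
        goto out;
    }
    error = step_one(buf, size);
    if (error != 0)
        goto out;
    error = step_two(buf, size);
out:
    free(buf);        /* free(NULL) is a no-op, so one exit path suffices */
    return (error);
}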

In D7589#162441, @manu wrote:
In D7589#157710, @imp wrote:

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

I would like it if loader.efi understood partitions. @manu looked into it at BSDCan, however from my understanding, he found it was easier to teach U-Boot to expose partitions to loader.efi than add the existing GPT/MBR code to loader.efi.

If someone was to get it working I would imagine we could rename the partitions to be something like disk0p0 or disk1s3. The device name may be different than the kernel, however the partition id should be the same.

I think that the problem I got during bsdcan was boot1 unable to work well with unaligned partition so I would need to find time to look at this again (because of course I don't have the patches anymore).

On the review itself I'm a bit concerned about what this would mean for U-Boot.
I admit that I haven't read all the code but it seems that with this patch the UEFI implementation would need to support EFI_SIMPLE_FILE_SYSTEM (which U-Boot doesn't).
It is true that arm/arm64 aren't Tier-1 archs like AMD64, and that a UEFI implementation is required to support this protocol (from my understanding of the spec), but all the effort I've put into running boot1.efi/loader.efi on arm/arm64 (because this is the right way to go) would be useless if this is the case.

Erik: Can you confirm whether EFI_SIMPLE_FILE_SYSTEM is needed?
If this is the case we would need a fallback on what boot1/loader is currently doing.

Since we're implementing an EFI_SIMPLE_FILE_SYSTEM for UFS and ZFS, why would u-boot not supporting it matter?

Warner

In D7589#162585, @imp wrote:
In D7589#162441, @manu wrote:
In D7589#157710, @imp wrote:

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

I would like it if loader.efi understood partitions. @manu looked into it at BSDCan, however from my understanding, he found it was easier to teach U-Boot to expose partitions to loader.efi than add the existing GPT/MBR code to loader.efi.

If someone was to get it working I would imagine we could rename the partitions to be something like disk0p0 or disk1s3. The device name may be different than the kernel, however the partition id should be the same.

I think that the problem I got during bsdcan was boot1 unable to work well with unaligned partition so I would need to find time to look at this again (because of course I don't have the patches anymore).

On the review itself I'm a bit concerned about what this would mean for U-Boot.
I admit that I haven't read all the code but it seems that with this patch the UEFI implementation would need to support EFI_SIMPLE_FILE_SYSTEM (which U-Boot doesn't).
It is true that arm/arm64 aren't Tier-1 archs like AMD64, and that a UEFI implementation is required to support this protocol (from my understanding of the spec), but all the effort I've put into running boot1.efi/loader.efi on arm/arm64 (because this is the right way to go) would be useless if this is the case.

Erik: Can you confirm whether EFI_SIMPLE_FILE_SYSTEM is needed?
If this is the case we would need a fallback on what boot1/loader is currently doing.

Since we're implementing an EFI_SIMPLE_FILE_SYSTEM for UFS and ZFS, why would u-boot not supporting it matter?

Warner

After reading the code, the problem would not be EFI_SIMPLE_FILE_SYSTEM but InstallMultipleProtocolInterfaces which uboot doesn't support (it always returns EFI_EXIT(EFI_OUT_OF_RESOURCES);)

cem added inline comments.
sys/boot/efi/boot1/Makefile
32–37

Combine these two sections?

sys/boot/efi/libefi/errno.c
38

switch space left-paren

sys/boot/efi/libefi/handles.c
41–42

These are initialized with zero values by definition; the change isn't needed.

49–50

spaces missing

70–73

This rename from unit to i seems gratuitous. What's the new unit value used for? The diff doesn't have context so I can't tell.

sys/boot/efi/libefi/string16.c
32–35

strlen should return size_t, not int; both the return value and i should change to size_t.

37

str[i] != 0

42

space between CHAR16 and asterisk.

43

Opening brace of a function goes on a line by itself.

49–50

Same issues as strcpy16 above.

51

*src != 0

65

I think the treatment of >8-bit CHAR16s (and perhaps >7-bit) probably needs to be better than just truncating them. Replace with '?' or similar is fine; UTF-8 encoding seems excessive but would be fine. What do we expect the output encoding of this routine to be? US-ASCII? UTF-8?

73–74

Same issues as strcpy16 above.
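
Pulling the string16.c nits together, the helpers would come out looking roughly like this (a sketch of the suggested style; strlen16 and cpy16to8 are assumed names, only strcpy16 is mentioned above):

#include <stand.h>

#include <efi.h>

static size_t
strlen16(const CHAR16 *str)
{
    size_t i;

    for (i = 0; str[i] != 0; i++)
        ;
    return (i);
}

static CHAR16 *
strcpy16(CHAR16 *dst, const CHAR16 *src)
{
    size_t i;

    for (i = 0; src[i] != 0; i++)
        dst[i] = src[i];
    dst[i] = 0;
    return (dst);
}

/*
 * Narrowing copy: anything outside 7-bit ASCII becomes '?' instead of
 * being silently truncated, per the encoding question above.
 */
static void
cpy16to8(const CHAR16 *src, char *dst, size_t len)
{
    size_t i;

    for (i = 0; i < len - 1 && src[i] != 0; i++)
        dst[i] = (src[i] < 0x80) ? (char)src[i] : '?';
    dst[i] = '\0';
}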

My biggest concern is that this is a giant patch that does a bunch of different things. I'd like to see smaller commits that do one thing at a time.

Probably the easier way to manage that is a git stack and arc diff --update Dfoo HEAD^^^..HEAD^^ or whatever. That would also solve the missing context.

There are still a bunch of style problems that need fixing.

Thanks for doing this work. It looks like a great start.

eric_metricspace.net edited edge metadata.
eric_metricspace.net set the repository for this revision to rS FreeBSD src repository - subversion.
eric_metricspace.net marked 2 inline comments as done.

Fix a bunch of style issues. Still need to restore preferred partition support.

The warning level is set to 3 in boot1 to match the warning level in loader. The reason for this in both cases is that the declaration of netif_drivers causes a warning at level 6. This is a general problem, due to the lack of an extern declaration of netif_drivers in any header file, and is out of scope for this patch.

eric_metricspace.net added inline comments.
sys/boot/efi/boot1/Makefile
11

This is done for the same reason that the warning level is set to 3 in loader. The declaration of netif_drivers causes a warning, because netif_drivers is not declared in any header file.

I would ask that issue be addressed in a separate fix.

sys/boot/efi/boot1/boot1.c
423–427

Actually, the cleanups vary depending on the position. I don't think it would be feasible to do it that way.

sys/boot/efi/include/efiprot.h
565–566

Not sure what this implies for this fix, if anything.

sys/boot/efi/libefi/handles.c
70–73

The unit variable is indeed used later, and in a place where it may differ from i.

In D7589#162756, @cem wrote:

My biggest concern is that this is a giant patch that does a bunch of different things. I'd like to see smaller commits that do one thing at a time.

Unfortunately, I don't see a way it could be broken up without seriously complicating things.

One thing to consider here is that this changes the interaction of boot1 and loader. Specifically, boot1 is now registering EFI drivers that may then be used by loader. While I have tested to make sure that things work fine with all combinations of a new and old boot1 and loader, it's probably better to do the refactoring as an atomic set so that we minimize the number of people with odd configurations.

This is especially true since I have GELI support coming right on the heels of this refactoring.

In D7589#162585, @imp wrote:
In D7589#162441, @manu wrote:
In D7589#157710, @imp wrote:

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

I would like it if loader.efi understood partitions. @manu looked into it at BSDCan, however from my understanding, he found it was easier to teach U-Boot to expose partitions to loader.efi than add the existing GPT/MBR code to loader.efi.

If someone was to get it working I would imagine we could rename the partitions to be something like disk0p0 or disk1s3. The device name may be different than the kernel, however the partition id should be the same.

I think that the problem I got during bsdcan was boot1 unable to work well with unaligned partition so I would need to find time to look at this again (because of course I don't have the patches anymore).

On the review itself I'm a bit concerned about what this would mean for U-Boot.
I admit that I haven't read all the code but it seems that with this patch the UEFI implementation would need to support EFI_SIMPLE_FILE_SYSTEM (which U-Boot doesn't).
It is true that arm/arm64 aren't Tier-1 archs like AMD64, and that a UEFI implementation is required to support this protocol (from my understanding of the spec), but all the effort I've put into running boot1.efi/loader.efi on arm/arm64 (because this is the right way to go) would be useless if this is the case.

Erik: Can you confirm whether EFI_SIMPLE_FILE_SYSTEM is needed?
If this is the case we would need a fallback on what boot1/loader is currently doing.

Since we're implementing an EFI_SIMPLE_FILE_SYSTEM for UFS and ZFS, why would u-boot not supporting it matter?

Warner

Yes, EFI_SIMPLE_FILE_SYSTEM is essential.

I do not know what uboot is exactly, but as Warner points out, the efifs driver provides an EFI_SIMPLE_FILE_SYSTEM interface for all existing filesystem drivers. Thus, it should be possible to include those in any system presenting an EFI-style interface.

In D7589#162680, @manu wrote:

After reading the code, the problem would not be EFI_SIMPLE_FILE_SYSTEM but InstallMultipleProtocolInterfaces which uboot doesn't support (it always returns EFI_EXIT(EFI_OUT_OF_RESOURCES);)

The EFI driver writer's guide (and other EFI specs) state that InstallMultipleProtocolInterfaces is the preferred method for registering protocol interfaces, and that InstallProtocolInterface should be avoided.

I think it would be preferable to implement the method in uboot (it shouldn't be hard to do; just a loop over a va_args).
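
For reference, the "loop over va_args" amounts to something like this (a sketch of the UEFI-spec semantics, not U-Boot code; unwinding of already-installed interfaces on failure is omitted):

#include <stdarg.h>

#include <efi.h>
#include <efilib.h>

static EFI_STATUS
install_multiple_protocol_interfaces(EFI_HANDLE *handle, ...)
{
    va_list ap;
    EFI_GUID *guid;
    VOID *iface;
    EFI_STATUS status = EFI_SUCCESS;

    va_start(ap, handle);
    for (;;) {
        guid = va_arg(ap, EFI_GUID *);
        if (guid == NULL)        /* the argument list is NULL-terminated */
            break;
        iface = va_arg(ap, VOID *);
        status = BS->InstallProtocolInterface(handle, guid,
            EFI_NATIVE_INTERFACE, iface);
        if (EFI_ERROR(status))
            break;    /* a real implementation should also uninstall what it already installed */
    }
    va_end(ap);
    return (status);
}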

To proceed, I need guidance on what exactly should be done to restore the preferred functionality.

So, for the preferred functionality, we want to prefer the device path from which we loaded the image for boot1? Is this correct?

In D7589#162680, @manu wrote:

After reading the code, the problem would not be EFI_SIMPLE_FILE_SYSTEM but InstallMultipleProtocolInterfaces which uboot doesn't support (it always returns EFI_EXIT(EFI_OUT_OF_RESOURCES);)

The EFI driver writer's guide (and other EFI specs) state that InstallMultipleProtocolInterfaces is the preferred method for registering protocol interfaces, and that InstallProtocolInterface should be avoided.

I think it would be preferable to implement the method in uboot (it shouldn't be hard to do; just a loop over a va_args).

I concur. Someone should prepare a patch to u-boot for this.

To proceed, I need guidance on what exactly should be done to restore the preferred functionality.

So, for the preferred functionality, we want to prefer the device path from which we loaded the image for boot1? Is this correct?

The preferred boot order is:
(1) Any zpool found on the same disk as the boot device.
(2) Any ufs partition found on the same disk

If those aren't found we pick the first one we can find.

Ideally, we'd use the EFI boot variables first though.

The preferred partition functionality is restored. The boot1 program now obtains the device handle from the loaded image handle, and checks for the presence of an EFI_SIMPLE_FILE_SYSTEM interface after attaching all drivers before scanning all partitions. This will have the effect of preferring the boot device on which the boot1 program resides.
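
Concretely, the preference works roughly like the sketch below (illustrative only; the function name, the fallback behaviour, and the GUID macro spellings are assumptions):

#include <efi.h>
#include <efilib.h>

static EFI_GUID LoadedImageGUID = LOADED_IMAGE_PROTOCOL;
static EFI_GUID SimpleFileSystemGUID = SIMPLE_FILE_SYSTEM_PROTOCOL;

/* Returns the device to try first, or NULL to fall back to a full scan. */
static EFI_HANDLE
preferred_boot_device(EFI_HANDLE image_handle)
{
    EFI_LOADED_IMAGE *img;
    VOID *sfs;

    /* The handle of the device boot1 itself was loaded from. */
    if (EFI_ERROR(BS->HandleProtocol(image_handle, &LoadedImageGUID,
        (VOID **)&img)))
        return (NULL);

    /*
     * After the drivers have attached, prefer that device if it now
     * carries a simple-filesystem interface (ours or the firmware's).
     */
    if (!EFI_ERROR(BS->HandleProtocol(img->DeviceHandle,
        &SimpleFileSystemGUID, &sfs)))
        return (img->DeviceHandle);

    return (NULL);
}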

sys/boot/efi/boot1/boot1.c
497

Restored in the latest update

eric_metricspace.net set the repository for this revision to rS FreeBSD src repository - subversion.
eric_metricspace.net marked an inline comment as done.

IFC, also properly restored preferred functionality

There isn't a need to clean up memory after boot1, and in fact, that will probably break things. Once an EFI driver is initialized and attached to handles, it stays around until ExitBootServices or until the driver is unloaded (which currently isn't done ever).

boot1 and loader both attempt to attach their drivers to each handle, but at most one of them (usually boot1) will succeed. The only reason loader does a driver initialization/attach pass is to handle the case where GRUB is loading loader directly, in fact. If you know for a fact you're using boot1, you could compile those bits out of loader completely.

The GELI patch actually needs this to be the case, as it's how the patch ensures that the user only ever gets asked for a password once.

You obviously can not clear the memory for FS driver bits as it is running, but the rest is

In D7589#157710, @imp wrote:

It has by far caused us no end of pain. What's part10:? Our systems have 4 drives with 9 partitions on each one (except the ones that have 2, surprise!). The ordering is absolutely random as well, it seems. Some systems get part10 being the boot disk, others get part 2. Totally nuts.

I would like it if loader.efi understood partitions. @manu looked into it at BSDCan, however from my understanding, he found it was easier to teach U-Boot to expose partitions to loader.efi than add the existing GPT/MBR code to loader.efi.

If someone was to get it working I would imagine we could rename the partitions to be something like disk0p0 or disk1s3. The device name may be different than the kernel, however the partition id should be the same.

sys/boot/efi/boot1/Makefile
109

-lstand alone does not work well, it means the /usr/lib/libstand.a will be used or some such -- I got an error:

if nm boot1.sym | grep ' U '; then echo "Undefined symbols in boot1.sym"; exit 1; fi

U stpcpy

Undefined symbols in boot1.sym

*** Error code 1

it either needs -L with path or just keep ${LIBSTAND}

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

Failing in FS Backend would indicate that something's going wrong with the filesystem attach phase. The FS Backend initialization scans all device handles and attaches the right filesystem drivers.

I have been testing on my own laptops with 2 disks, one with an EFI system partition and a ZFS volume, one with a ZFS cache and log.

Therefore, the most likely culprit is UFS, though multiple ZFS volumes or the illumos mirror could be a problem.

What VM are you using? The easiest way to hunt this down would be to replicate your setup locally. The second easiest would be to get you a patch with extra logging.

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

Failing in FS Backend would indicate that something's going wrong with the filesystem attach phase. The FS Backend initialization scans all device handles and attaches the right filesystem drivers.

I have been testing on my own laptops with 2 disks, one with an EFI system partition and a ZFS volume, one with a ZFS cache and log.

Therefore, the most likely culprit is UFS, though multiple ZFS volumes or the illumos mirror could be a problem.

What VM are you using? The easiest way to hunt this down would be to replicate your setup locally. The second easiest would be to get you a patch with extra logging.

vmware fusion. The illumos zfs mirror by itself should not be a problem as the loader zfs reader can read it both ways without problems (the illumos loader can access fbsd pools and vice versa); also multiple pools should not be an issue as such - and they are definitely a scenario for tests. UFS may be an issue, altho in this test setup it's just empty; I had that idea to create /boot on it as a next step:) There is also another potential pain point - the memory allocations - as boot1 is now linked with libstand, could it be the malloc/free used all around are interfering somehow...

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

Failing in FS Backend would indicate that something's going wrong with the filesystem attach phase. The FS Backend initialization scans all device handles and attaches the right filesystem drivers.

I have been testing on my own laptops with 2 disks, one with an EFI system partition and a ZFS volume, one with a ZFS cache and log.

Therefore, the most likely culprit is UFS, though multiple ZFS volumes or the illumos mirror could be a problem.

What VM are you using? The easiest way to hunt this down would be to replicate your setup locally. The second easiest would be to get you a patch with extra logging.

vmware fusion. The illumos zfs mirror by itself should not be a problem as the loader zfs reader can read it both ways without problems (the illumos loader can access fbsd pools and vice versa); also multiple pools should not be an issue as such - and they are definitely a scenario for tests. UFS may be an issue, altho in this test setup it's just empty; I had that idea to create /boot on it as a next step:) There is also another potential pain point - the memory allocations - as boot1 is now linked with libstand, could it be the malloc/free used all around are interfering somehow...

The AllocatePool usage probably should be removed. It's a holdover from the old boot1 code. I'll try removing it, and then testing with UFS. If you're still seeing failures after that, and I don't see them with UFS partitions, we can look into diagnostic options.

eric_metricspace.net removed rS FreeBSD src repository - subversion as the repository for this revision.

Replaced uses of AllocatePool with malloc. Tested with a UFS partition, no problem.

tsoome, could you see if this affects anything? If not, then email me directly to coordinate diagnosis of the bug your setup found.

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

Failing in FS Backend would indicate that something's going wrong with the filesystem attach phase. The FS Backend initialization scans all device handles and attaches the right filesystem drivers.

I have been testing on my own laptops with 2 disks, one with an EFI system partition and a ZFS volume, one with a ZFS cache and log.

Therefore, the most likely culprit is UFS, though multiple ZFS volumes or the illumos mirror could be a problem.

What VM are you using? The easiest way to hunt this down would be to replicate your setup locally. The second easiest would be to get you a patch with extra logging.

vmware fusion. The illumos zfs mirror by itself should not be a problem as the loader zfs reader can read it both ways without problems (the illumos loader can access fbsd pools and vice versa); also multiple pools should not be an issue as such - and they are definitely a scenario for tests. UFS may be an issue, altho in this test setup it's just empty; I had that idea to create /boot on it as a next step:) There is also another potential pain point - the memory allocations - as boot1 is now linked with libstand, could it be the malloc/free used all around are interfering somehow...

Replaced uses of AllocatePool with malloc. Tested with a UFS partition, no problem.

tsoome, could you see if this affects anything? If not, then email me directly to coordinate diagnosis of the bug your setup found.

I am getting rejects on patch, could you rebase on current head?:)

tsoome@freebsd-2:/usr/eric % find . -name \*.rej
./lib/libstand/stand.h.rej
./sys/boot/efi/boot1/zfs_module.c.rej
./sys/boot/efi/boot1/ufs_module.c.rej
./sys/boot/efi/boot1/boot_module.h.rej
tsoome@freebsd-2:/usr/eric %

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

I did build from current state, the patch did apply clean, build failed due to the -lstand issue noted before, that's simple to fix.

Unfortunately the boot is not going well, last message is:

Initializing modules: FS Backend-

the '-' is spinner it seems. The VM has 4 disks, illumos mirror zfs on first 2, third has fbsd, 4th (for this test) has EFI system partition, and zfs and ufs partitions, zfs partition has bootfs with /boot directory.

Failing in FS Backend would indicate that something's going wrong with the filesystem attach phase. The FS Backend initialization scans all device handles and attaches the right filesystem drivers.

I have been testing on my own laptops with 2 disks, one with an EFI system partition and a ZFS volume, one with a ZFS cache and log.

Therefore, the most likely culprit is UFS, though multiple ZFS volumes or the illumos mirror could be a problem.

What VM are you using? The easiest way to hunt this down would be to replicate your setup locally. The second easiest would be to get you a patch with extra logging.

vmware fusion. The illumos zfs mirror by itself should not be a problem as the loader zfs reader can read it both ways without problems (the illumos loader can access fbsd pools and vice versa); also multiple pools should not be an issue as such - and they are definitely a scenario for tests. UFS may be an issue, altho in this test setup it's just empty; I had that idea to create /boot on it as a next step:) There is also another potential pain point - the memory allocations - as boot1 is now linked with libstand, could it be the malloc/free used all around are interfering somehow...

Replaced uses of AllocatePool with malloc. Tested with a UFS partition, no problem.

tsoome, could you see if this affects anything? If not, then email me directly to coordinate diagnosis of the bug your setup found.

I am getting rejects on patch, could you rebase on current head?:)

tsoome@freebsd-2:/usr/eric % find . -name \*.rej
./lib/libstand/stand.h.rej
./sys/boot/efi/boot1/zfs_module.c.rej
./sys/boot/efi/boot1/ufs_module.c.rej
./sys/boot/efi/boot1/boot_module.h.rej
tsoome@freebsd-2:/usr/eric %

Strange. I had rebased just before I sent this up. I am rebasing from the github master, if that changes things at all. The last commit in the history is an hour ago.

My github repo for this is here: https://github.com/emc2/freebsd/tree/efize_new

Be aware: I'm using rebase -i to squash everything down to one commit at this point, so there's a lot of forced pushes going into efize_new

I am getting rejects on patch, could you rebase on current head?:)

tsoome@freebsd-2:/usr/eric % find . -name \*.rej
./lib/libstand/stand.h.rej
./sys/boot/efi/boot1/zfs_module.c.rej
./sys/boot/efi/boot1/ufs_module.c.rej
./sys/boot/efi/boot1/boot_module.h.rej
tsoome@freebsd-2:/usr/eric %

Strange. I had rebased just before I sent this up. I am rebasing from the github master, if that changes things at all. The last commit in the history is an hour ago.

My github repo for this is here: https://github.com/emc2/freebsd/tree/efize_new

Be aware: I'm using rebase -i to squash everything down to one commit at this point, so there's a lot of forced pushes going into efize_new

that may explain, as my fbsd source instances are all against svn ... so could it be the git has been missing some updates?

I am getting rejects on patch, could you rebase on current head?:)

tsoome@freebsd-2:/usr/eric % find . -name \*.rej
./lib/libstand/stand.h.rej
./sys/boot/efi/boot1/zfs_module.c.rej
./sys/boot/efi/boot1/ufs_module.c.rej
./sys/boot/efi/boot1/boot_module.h.rej
tsoome@freebsd-2:/usr/eric %

Strange. I had rebased just before I sent this up. I am rebasing from the github master, if that changes things at all. The last commit in the history is an hour ago.

My github repo for this is here: https://github.com/emc2/freebsd/tree/efize_new

Be aware: I'm using rebase -i to squash everything down to one commit at this point, so there's a lot of forced pushes going into efize_new

that may explain, as my fbsd source instances are all against svn ... so could it be the git has been missing some updates?

There's probably some lag going on somewhere. I see Ed Maste did push a change to the EFI loaders.

It might make more sense to build off of my repo in order to diagnose the issue your setup found.

I am getting rejects on patch, could you rebase on current head?:)

tsoome@freebsd-2:/usr/eric % find . -name \*.rej
./lib/libstand/stand.h.rej
./sys/boot/efi/boot1/zfs_module.c.rej
./sys/boot/efi/boot1/ufs_module.c.rej
./sys/boot/efi/boot1/boot_module.h.rej
tsoome@freebsd-2:/usr/eric %

Strange. I had rebased just before I sent this up. I am rebasing from the github master, if that changes things at all. The last commit in the history is an hour ago.

My github repo for this is here: https://github.com/emc2/freebsd/tree/efize_new

Be aware: I'm using rebase -i to squash everything down to one commit at this point, so there's a lot of forced pushes going into efize_new

that may explain, as my fbsd source instances are all against svn ... so could it be the git has been missing some updates?

There's probably some lag going on somewhere. I see Ed Maste did push a change to the EFI loaders.

It might make more sense to build off of my repo in order to diagnose the issue your setup found.

Note, I may also have been hit by my own bug... see D8644

I am getting rejects on patch, could you rebase on current head?:)

tsoome@freebsd-2:/usr/eric % find . -name \*.rej
./lib/libstand/stand.h.rej
./sys/boot/efi/boot1/zfs_module.c.rej
./sys/boot/efi/boot1/ufs_module.c.rej
./sys/boot/efi/boot1/boot_module.h.rej
tsoome@freebsd-2:/usr/eric %

Strange. I had rebased just before I sent this up. I am rebasing from the github master, if that changes things at all. The last commit in the history is an hour ago.

My github repo for this is here: https://github.com/emc2/freebsd/tree/efize_new

Be aware: I'm using rebase -i to squash everything down to one commit at this point, so there's a lot of forced pushes going into efize_new

that may explain, as my fbsd source instances are all against svn ... so could it be the git has been missing some updates?

There's probably some lag going on somewhere. I see Ed Maste did push a change to the EFI loaders.

It might make more sense to build off of my repo in order to diagnose the issue your setup found.

Note, I may also have been hit by my own bug... see D8644

Interesting...

I'm not sure what to do about this if that's the case.

In any case, I should probably put out a CFT on this patch, given your finding.

I've posted a CFT to -hackers, -current, and -amd64.

tsoome, can I get you to build and test with the extra_logging branch on my github repo: https://github.com/emc2/freebsd

The CFT turned up someone who's having the same error, and I have a few possible root causes I suspect. The extra_logging branch has some extra logging that should help track it down.

tsoome, can I get you to build and test with the extra_logging branch on my github repo: https://github.com/emc2/freebsd

The CFT turned up someone who's having the same error, and I have a few possible root causes I suspect. The extra_logging branch has some extra logging that should help track it down.

Sorry, I haven't had time to check this one - I had a bunch of other issues to attend to :( I'll try to get some time to grab the git branch and do a test build...

OK, based on others' reproduction of tsoome's issue on -hackers, I have a credible diagnostic theory.

The new EFI code will try to attach filesystem drivers to every partition, and will end up trying every driver on any partition that doesn't actually contain a filesystem it can recognize. The most obvious example of this would be freebsd-boot partitions containing BIOS boot code. Another would be something containing a non-filesystem. If there's a bug in the dosfs code, this would likely trigger it.

The reason I couldn't reproduce it is that my UFS partition got detected as UFS first, and my dosfs partition was bound by the firmware, so it already had an EFI_SIMPLE_FILESYSTEM interface before the boot code ran (so it didn't probe the partition).

The coup de grâce here would be to either commit the fix for the dosfs bug, or else disable the dosfs driver and see if it still hangs.

Another issue: since the UEFI spec guarantees that a driver exists which attaches an EFI_SIMPLE_FILESYSTEM interface to any FAT32 partition, do we really need our own dosfs driver in the EFI boot loader? I would think not.
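
In other words, the attach pass amounts to a loop like the one below (hypothetical names; it also shows why one misbehaving backend probe can hang the whole pass):

#include <efi.h>
#include <efilib.h>

static EFI_GUID BlockIoGUID = BLOCK_IO_PROTOCOL;
static EFI_GUID SimpleFileSystemGUID = SIMPLE_FILE_SYSTEM_PROTOCOL;

static void try_fs_backends(EFI_HANDLE h);    /* assumed: probes UFS, ZFS, dosfs, ... */

static void
attach_fs_backends(void)
{
    EFI_HANDLE *handles;
    UINTN i, nhandles;
    VOID *iface;

    if (EFI_ERROR(BS->LocateHandleBuffer(ByProtocol, &BlockIoGUID, NULL,
        &nhandles, &handles)))
        return;
    for (i = 0; i < nhandles; i++) {
        /* Skip partitions the firmware (or boot1) already serves. */
        if (!EFI_ERROR(BS->HandleProtocol(handles[i],
            &SimpleFileSystemGUID, &iface)))
            continue;
        /*
         * Every backend probes the raw partition here; a probe that
         * mishandles unrecognized data (say, a freebsd-boot partition)
         * can spin forever, matching the hang at "FS Backend".
         */
        try_fs_backends(handles[i]);
    }
    BS->FreePool(handles);
}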

OK, based on others' reproduction of tsoome's issue on -hackers, I have a credible diagnostic theory.

The new EFI code will try to attach filesystem drivers to every partition, and will end up trying every driver on any partition that doesn't actually contain a filesystem it can recognize. The most obvious example of this would be freebsd-boot partitions containing BIOS boot code. Another would be something containing a non-filesystem. If there's a bug in the dosfs code, this would likely trigger it.

The reason I couldn't reproduce it is that my UFS partition got detected as UFS first, and my dosfs partition was bound by the firmware, so it already had an EFI_SIMPLE_FILESYSTEM interface before the boot code ran (so it didn't probe the partition).

The coup de grâce here would be to either commit the fix for the dosfs bug, or else disable the dosfs driver and see if it still hangs.

Another issue: since the UEFI spec guarantees that a driver exists which attaches an EFI_SIMPLE_FILESYSTEM interface to any FAT32 partition, do we really need our own dosfs driver in the EFI boot loader? I would think not.

Yep, we can skip dosfs there, and for very simple reason - if the firmware is not able to read system partition, there is no boot anyhow.

But, I started to think about another problem. If we have all disks scanned and all file systems probed, can we end up having all those loaded in memory at once? What will happen with memory consumption and, more importantly, with fragmentation? Note the loader.efi is using a fixed 64MB heap for libstand zmalloc, but boot1 is using the UEFI memory API, and it can get really ugly on systems with many disks. Also, for example, the qemu memory allocator can get the pieces from quite random places... Just checking, because there was a bit of noise about increasing the heap in loader related to very small memory configurations...

Yep, we can skip dosfs there, and for very simple reason - if the firmware is not able to read system partition, there is no boot anyhow.

I'm currently testing a build with the dosfs driver removed. Please try it on your setup when I push it to efize_new

But, I started to think about another problem. If we have all disks scanned and all file systems probed, can we end up having all those loaded in memory at once? What will happen with memory consumption and, more importantly, with fragmentation? Note the loader.efi is using a fixed 64MB heap for libstand zmalloc, but boot1 is using the UEFI memory API, and it can get really ugly on systems with many disks. Also, for example, the qemu memory allocator can get the pieces from quite random places... Just checking, because there was a bit of noise about increasing the heap in loader related to very small memory configurations...

Not anymore. Both are using libstand malloc now, which, on an EFI system, I believe uses AllocatePool to obtain its memory. Now, the boot1 drivers will continue to use their pool, and loader will obtain its own pool, so there's a bit of suboptimal behavior going on there, but there's not really any clean way to communicate arbitrary data between boot1 and loader (particularly not addresses). Moreover, you've got possible setups where someone's loading loader from GRUB or rEFInd, so I wouldn't recommend trying to engineer something like that.
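
For context, the fixed heap tsoome mentions is handed to libstand along these lines (a simplified sketch, not a quote of the loader's startup code; the 64MB size and the function name are illustrative):

#include <stand.h>

#include <efi.h>
#include <efilib.h>

#define HEAP_SIZE   (64 * 1024 * 1024)    /* illustrative 64MB region */

static void
setup_heap(void)
{
    EFI_PHYSICAL_ADDRESS heap;

    if (EFI_ERROR(BS->AllocatePages(AllocateAnyPages, EfiLoaderData,
        EFI_SIZE_TO_PAGES(HEAP_SIZE), &heap)))
        panic("failed to allocate heap");
    /* libstand's malloc/zalloc then carve allocations out of this region. */
    setheap((void *)(uintptr_t)heap, (void *)(uintptr_t)(heap + HEAP_SIZE));
}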

ok, got it built, and initial test. And there is this problem...

OK lsdev

EFI0:    EFI(SIMPLE_FILE_SYSTEM)
EFI1:    EFI(SIMPLE_FILE_SYSTEM)
EFI2:    EFI(SIMPLE_FILE_SYSTEM)
EFI3:    EFI(SIMPLE_FILE_SYSTEM)
EFI4:    EFI(SIMPLE_FILE_SYSTEM)
EFI5:    EFI(SIMPLE_FILE_SYSTEM)
EFI6:    EFI(SIMPLE_FILE_SYSTEM)
EFI7:    EFI(SIMPLE_FILE_SYSTEM)

net devices:

net0:

OK

Second problem is that the devices are sorted by file system type, So first I get my 4 EFI System partitions (I have 4 disks on this VM), then the rest.

Third one is that for some reason it does list root directory, but not subdirectories - probably because:

? 270 dev
? 5 opt
? 20 kernel
? 3 code
? 2 raid
? 2 rpool
? 187 lib

the ls -l does see the size, but not the type; note this is zfs.

And finally, for some reason sometimes the ls was running really slow - not always.

But the disk and file system identification is huge issue, because from current snapshot I can not get even the size hints;)

So, it is very impressive work, but it will take a lot still:) Well, it should be possible to use the disk+part interface I did and relate the file systems with actual partitions... but still there is huge question about identification, and for zfs case - as an admin, I really can not lose the boot environments....

ok, got it built, and initial test. And there is this problem...

OK lsdev

EFI0:    EFI(SIMPLE_FILE_SYSTEM)
EFI1:    EFI(SIMPLE_FILE_SYSTEM)
EFI2:    EFI(SIMPLE_FILE_SYSTEM)
EFI3:    EFI(SIMPLE_FILE_SYSTEM)
EFI4:    EFI(SIMPLE_FILE_SYSTEM)
EFI5:    EFI(SIMPLE_FILE_SYSTEM)
EFI6:    EFI(SIMPLE_FILE_SYSTEM)
EFI7:    EFI(SIMPLE_FILE_SYSTEM)

net devices:

net0:

OK

Yeah, that's problematic. I guess we're back to the question of what should go here: EFI device paths, or something else?

There is an EFI interface that allows a descriptive name to be attached to an arbitrary EFI_HANDLE. Perhaps this could be added as part of the probing process.

Second problem is that the devices are sorted by file system type, So first I get my 4 EFI System partitions (I have 4 disks on this VM), then the rest.

I gather they should be sorted by disk or partition, then? This should be relatively easy to achieve.

Third one is that for some reason it does list root directory, but not subdirectories - probably because:

? 270 dev
? 5 opt
? 20 kernel
? 3 code
? 2 raid
? 2 rpool
? 187 lib

the ls -l does see the size, but not the type; note this is zfs.

Probably an oversight. Actually, I seem to recall something about the EFI_SIMPLE_FILE_SYSTEM interface making you have to open a file to find out its type. I probably intended to add in file types later, but never did.

And finally, for some reason sometimes the ls was running really slow - not always.

But the disk and file system identification is huge issue, because from current snapshot I can not get even the size hints;)

So, it is very impressive work, but it will take a lot still:) Well, it should be possible to use the disk+part interface I did and relate the file systems with actual partitions... but still there is huge question about identification, and for zfs case - as an admin, I really can not lose the boot environments....

OK, seems I've got some more work to do.

Added functionality to print out information about the filesystems with lsdev.

The way this works is through the EFI_COMPONENT_NAME interface. The fs_driver now attaches these to every handle. They return the name that would be provided by the underlying driver's print function.

The efifs driver's print function looks for an EFI_COMPONENT_NAME interface, and uses that if it exists. Verbose mode also now prints out the EFI device path correctly.
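
The lookup on the print side then amounts to something like this (a sketch; the protocol typedef/GUID macro spelling, the language string, and the efifs_print_default() fallback are assumptions):

#include <stand.h>

#include <efi.h>
#include <efilib.h>

static EFI_GUID ComponentName2GUID = EFI_COMPONENT_NAME2_PROTOCOL_GUID;

static void efifs_print_default(EFI_HANDLE h);    /* assumed: old EFI(...) style output */

static void
efifs_print_one(EFI_HANDLE h)
{
    EFI_COMPONENT_NAME2_PROTOCOL *cn;
    CHAR16 *name;
    char buf[64];
    size_t i;

    if (!EFI_ERROR(BS->HandleProtocol(h, &ComponentName2GUID,
        (VOID **)&cn)) &&
        !EFI_ERROR(cn->GetControllerName(cn, h, NULL, "en", &name))) {
        /* Crude UCS-2 to ASCII narrowing for display purposes. */
        for (i = 0; i < sizeof(buf) - 1 && name[i] != 0; i++)
            buf[i] = (name[i] < 0x80) ? (char)name[i] : '?';
        buf[i] = '\0';
        printf("    %s\n", buf);    /* name supplied by the backing driver */
        return;
    }
    efifs_print_default(h);
}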

Please test this and make sure it works properly on complex setups.

It seems that I forgot about the file type issue with ls. I'll address that next, but please test the current patch to make sure it addresses the naming issue adequately.

Fixed issue with directory listing file types.

Please test everything and indicate any remaining issues that need to be resolved.

tsoome, could you try out the new changes on your complex disk setup and let me know what the output looks like?

ok, got it built, and initial test. And there is this problem...

OK lsdev

EFI0:    EFI(SIMPLE_FILE_SYSTEM)
EFI1:    EFI(SIMPLE_FILE_SYSTEM)
EFI2:    EFI(SIMPLE_FILE_SYSTEM)
EFI3:    EFI(SIMPLE_FILE_SYSTEM)
EFI4:    EFI(SIMPLE_FILE_SYSTEM)
EFI5:    EFI(SIMPLE_FILE_SYSTEM)
EFI6:    EFI(SIMPLE_FILE_SYSTEM)
EFI7:    EFI(SIMPLE_FILE_SYSTEM)

net devices:

net0:

OK

Yeah, that's problematic. I guess we're back to the question of what should go here: EFI device paths, or something else?

There is an EFI interface that allows a descriptive name to be attached to an arbitrary EFI_HANDLE. Perhaps this could be added as part of the probing process.

Second problem is that the devices are sorted by file system type, So first I get my 4 EFI System partitions (I have 4 disks on this VM), then the rest.

I gather they should be sorted by disk or partition, then? This should be relatively easy to achieve.

Third one is that for some reason it does list root directory, but not subdirectories - probably because:

? 270 dev
? 5 opt
? 20 kernel
? 3 code
? 2 raid
? 2 rpool
? 187 lib

the ls -l does see the size, but not the type; note this is zfs.

Probably an oversight. Actually, I seem to recall something about the EFI_SIMPLE_FILE_SYSTEM interface making you have to open a file to find out its type. I probably intended to add in file types later, but never did.

And finally, for some reason sometimes the ls was running really slow - not always.

But the disk and file system identification is huge issue, because from current snapshot I can not get even the size hints;)

So, it is very impressive work, but it will take a lot still:) Well, it should be possible to use the disk+part interface I did and relate the file systems with actual partitions... but still there is huge question about identification, and for zfs case - as an admin, I really can not lose the boot environments....

OK, seems I've got some more work to do.

With partition - filesystem mapping we can get the device handle without much issue - the handle stack has handles from FS down to partition and disk handle and the api to access the next handle from stack is there.

With zfs the situation is much more complicated as it is dual disk and file system, and we need an interface to access the BE list and to switch the "current" filesystem on top of the "disk", and the UEFI api itself does not help us there. It is possible to create a specific protocol on top of the zfs pool to implement a mechanism to get information about BEs and to switch, but it means that instead of reducing complexity, we are adding more... note that the dataset list specifying BEs can be quite long; on my dev host I currently have 73 entries.

Also, as I already wrote, for human operating the machine, it is really important to know exactly what I'm going to boot from.

tsoome, could you try out the new changes on your complex disk setup and let me know what the output looks like?

Yea, sorry I have been very busy, can you mail me the boot1 and loader binaries?

With partition - filesystem mapping we can get the device handle without much issue - the handle stack has handles from FS down to partition and disk handle and the api to access the next handle from stack is there.

With zfs the situation is much more complicated as it is dual disk and file system, and we need an interface to access the BE list and to switch the "current" filesystem on top of the "disk", and the UEFI api itself does not help us there. It is possible to create a specific protocol on top of the zfs pool to implement a mechanism to get information about BEs and to switch, but it means that instead of reducing complexity, we are adding more... note that the dataset list specifying BEs can be quite long; on my dev host I currently have 73 entries.

Also, as I already wrote, for human operating the machine, it is really important to know exactly what I'm going to boot from.

So what I did was use the COMPONENT_NAME2_PROTOCOL interface to report back the names of the devices. The raw device to EFI layer now installs a COMPONENT_NAME2_PROTOCOL interface when it installs the SIMPLE_FILE_SYSTEM_PROTOCOL interface.

The COMPONENT_NAME2_PROTOCOL interface currently uses code borrowed from the dev_print methods from the efipart and zfs underlying drivers. We could possibly refactor the dev_print method or add a new one in the future to make this a cleaner integration, but I want to avoid adding more refactorings than necessary here.

In some cases (such as when the firmware has installed a SIMPLE_FILE_SYSTEM_PROTOCOL), you'll get a handle with no COMPONENT_NAME2_PROTOCOL. In this case, it does the default behavior from before.

Also, on verbose mode, it prints out EFI device paths as well.

With partition - filesystem mapping we can get the device handle without much issue - the handle stack has handles from FS down to partition and disk handle and the api to access the next handle from stack is there.

With zfs the situation is much more complicated as it is dual disk and file system, and we need an interface to access the BE list and to switch the "current" filesystem on top of the "disk", and the UEFI api itself does not help us there. It is possible to create a specific protocol on top of the zfs pool to implement a mechanism to get information about BEs and to switch, but it means that instead of reducing complexity, we are adding more... note that the dataset list specifying BEs can be quite long; on my dev host I currently have 73 entries.

Also, as I already wrote, for human operating the machine, it is really important to know exactly what I'm going to boot from.

So what I did was use the COMPONENT_NAME2_PROTOCOL interface to report back the names of the devices. The raw device to EFI layer now installs a COMPONENT_NAME2_PROTOCOL interface when it installs the SIMPLE_FILE_SYSTEM_PROTOCOL interface.

The COMPONENT_NAME2_PROTOCOL interface currently uses code borrowed from the dev_print methods from the efipart and zfs underlying drivers. We could possibly refactor the dev_print method or add a new one in the future to make this a cleaner integration, but I want to avoid adding more refactorings than necessary here.

In some cases (such as when the firmware has installed a SIMPLE_FILE_SYSTEM_PROTOCOL), you'll get a handle with no COMPONENT_NAME2_PROTOCOL. In this case, it does the default behavior from before.

Also, on verbose mode, it prints out EFI device paths as well.

btw, there is also D8581 - it does work on another dimension however, by reusing disk and part API, and providing already familiar output for user. Anyhow, I am still thinking the biggest problem to get solved with your work is about how to deal with zfs BE switching, it really is quite essential from admin point of view. Perhaps the answer would be not really about file system layer, but disk layer.

btw, there is also D8581 - it does work on another dimension however, by reusing disk and part API, and providing already familiar output for user.

Eyeballing that one, I actually think it could probably auto-merge with my work and Just Work. Or maybe with some very minor edits.

Anyhow, I am still thinking the biggest problem to get solved with your work is about how to deal with zfs BE switching, it really is quite essential from admin point of view. Perhaps the answer would be not really about file system layer, but disk layer.

Boot environments actually aren't something I'm familiar with (I think they came into being after I did the ZFS+EFI work and devised my ZFS setups), so I'm going to need to do some research before addressing that. I'm hoping to clear the board of any other obvious issues first.

Update on this patch: I'm going to attempt to break it up into some smaller changesets to make it more manageable.