
zfs boot: prefer a pool with bootfs set for a boot pool
Abandoned (Public)

Authored by avg on Dec 6 2021, 12:42 PM.
Details

Summary

In other words, ignore pools without the bootfs property when considering
candidates for a boot pool. Use a pool without bootfs only if none of the
discovered pools has it.

This should make it easy to designate a boot pool on a system with
multiple pools, without having to shuffle disks, change the firmware boot
order, etc.
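As a rough illustration of that policy, here is a minimal C sketch in the style of the loader's libsa ZFS code. The spa struct and zfs_pools list below are simplified stand-ins for the real ones in zfsimpl.h, not the actual definitions:

```c
#include <stdint.h>
#include <sys/queue.h>

/*
 * Illustrative pool descriptor; the loader's real spa_t lives in
 * zfsimpl.h and carries much more state.
 */
struct spa {
	STAILQ_ENTRY(spa) spa_link;
	uint64_t spa_bootfs;	/* objnum of the bootfs dataset, 0 if unset */
};
STAILQ_HEAD(spa_list, spa);

extern struct spa_list zfs_pools;	/* pools in discovery order */

/*
 * Prefer the first discovered pool that has bootfs set; fall back to
 * the first discovered pool if none of them has it.  The relative
 * (discovery) order of the pools is not changed, only the preference.
 */
static struct spa *
spa_first_bootable(void)
{
	struct spa *spa;

	STAILQ_FOREACH(spa, &zfs_pools, spa_link) {
		if (spa->spa_bootfs != 0)
			return (spa);
	}
	return (STAILQ_FIRST(&zfs_pools));
}
```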

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

avg requested review of this revision. Dec 6 2021, 12:42 PM

imp added a comment. (D33302#752740)

What does 'first' mean? It's effectively a random list, in random order, since there's no boot priority indicated. Why would you prefer something random here? What if there are two or three? This can lead to weird behavior.

I'd prefer that, where possible, we prefer the ZPOOL that has an element on the disk we're currently booting from, with a fallback to anything else marked as bootable, since that's less unpredictable.

So I have some concerns.... With luck, one of the ZFS people will say there can be only one or something...

avg added a comment.

In D33302#752740, @imp wrote:

> What does 'first' mean? It's effectively a random list, in random order, since there's no boot priority indicated. Why would you prefer something random here? What if there are two or three? This can lead to weird behavior.

It means whatever it means now; the status quo is not changed.
As far as I recall, we probe disks in the "BIOS" order and their partitions in the natural order.
The pools are added in the order of their first vdev discovery.
In my experience with machines I have owned, the BIOS provided disks in their configured boot order.
But, as I said at the beginning, whatever that order is, this change does not affect it.
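A rough sketch of that probe loop, with all helper names hypothetical (the real code lives in the loader's ZFS probe path):

```c
#include <stdint.h>

/* All names below are illustrative, not the loader's actual API. */
extern int ndisks;
extern int disk_nparts(int unit);
extern int probe_partition(int unit, int part, uint64_t *guid);
extern int pool_known(uint64_t guid);
extern void pool_append(uint64_t guid);

/*
 * Disks are visited in the order the firmware reports them ("BIOS"
 * order), partitions in natural order, and a pool is appended the
 * first time one of its vdevs is seen, so the pool list ends up in
 * first-vdev discovery order.
 */
static void
probe_all_disks(void)
{
	for (int unit = 0; unit < ndisks; unit++) {
		for (int part = 0; part < disk_nparts(unit); part++) {
			uint64_t guid;

			if (probe_partition(unit, part, &guid) == 0 &&
			    !pool_known(guid))
				pool_append(guid);
		}
	}
}
```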

> I'd prefer that, where possible, we prefer the ZPOOL that has an element on the disk we're currently booting from, with a fallback to anything else marked as bootable, since that's less unpredictable.

In my experience, this is already the case.

> So I have some concerns.... With luck, one of the ZFS people will say there can be only one or something...

What this change does is select a pool where bootfs is set over pools where the bootfs property is not set.
The relative order of pools remains the same as in the current code.

tsoome added a comment.

In D33302#752740, @imp wrote:

> What does 'first' mean? It's effectively a random list, in random order, since there's no boot priority indicated. Why would you prefer something random here? What if there are two or three? This can lead to weird behavior.
>
> I'd prefer that, where possible, we prefer the ZPOOL that has an element on the disk we're currently booting from, with a fallback to anything else marked as bootable, since that's less unpredictable.
>
> So I have some concerns.... With luck, one of the ZFS people will say there can be only one or something...

Yes, I agree there; the "default" order of pools is what we get from iterating the disks. (Some?) BIOS systems do put the boot disk first in the list, but EFI, I think, is different there.

It would make sense to check the bootfs property on pools from the boot disks for the case of one boot pool plus data pool(s) on the boot disk; but even that can backfire if you have multiple boot pools... (which you probably should not do; use a shared boot pool instead).

The only deterministic method would be to attach information about the boot pool to the loader. This would be pretty easy to do with UEFI, but it is a hassle with BIOS boot. In illumos, I solved this by making the boot program installer (installboot) write the partition LBA into a dedicated variable slot in the binary (this breaks signed binaries, of course). In the case of UEFI, we can use the loader.env file mechanism we already have, but we would still need a method to create this information.
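For the UEFI case, such a mechanism might look like a loader.env entry along these lines. currdev is an existing loader variable, but carrying the boot pool choice this way is only a sketch of the idea, not current behavior:

```
# /efi/freebsd/loader.env -- hypothetical contents
# Pin the loader to a specific pool/dataset instead of relying on probing;
# some installer-side tool would still have to write this file.
currdev=zfs:zroot/ROOT/default:
```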

avg added a comment.

Well, I do not want to solve all problems with this small patch; I just want to improve the situation for the most typical installs.
Single pool: no problems, and nothing changes.
Single boot pool plus one or more data pools: the situation improves; just set bootfs on the boot pool.
Multiple boot pools: the situation remains just as it was before.

avg added a comment.

@tsoome, could you please clarify the nature of your objection to this change?

tsoome added a comment.

In D33302#814333, @avg wrote:

> @tsoome, could you please clarify the nature of your objection to this change?

OK, I guess we do not really change the current behavior too much. The real problem is making sure that we really do get the pools from the boot disk sorted first in the pool list. Or, in fact, once we have the boot pool's vdevs, we should actually stop scanning the remaining disks. But yes, that is another issue.

This revision is now accepted and ready to land. Jul 20 2022, 10:50 PM

(gpt)zfsboot used its own mechanism to pick the boot pool, so it was not
affected by the earlier changes.
The boot pool was the first pool discovered while probing the boot disk;
pools discovered via other disks were noted but ignored.
Now (gpt)zfsboot has been changed to use spa_first_bootable().
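Sketched as a call site, the new flow in (gpt)zfsboot would look roughly like this; only spa_first_bootable() is named by the revision, the surrounding helpers are hypothetical:

```c
/*
 * Illustrative zfsboot call site; only spa_first_bootable() is named
 * by this revision, the other helpers are hypothetical.
 */
struct spa;

extern void probe_all_disks(void);
extern struct spa *spa_first_bootable(void);
extern void boot_from_pool(struct spa *);

static void
pick_boot_pool(void)
{
	struct spa *spa;

	probe_all_disks();		/* populate the pool list */
	spa = spa_first_bootable();	/* prefer a pool with bootfs set */
	if (spa != NULL)
		boot_from_pool(spa);	/* hand off to the next stage */
}
```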

This revision now requires review to proceed. Aug 1 2022, 4:24 PM

avg added a comment.

I realized, after feedback from testers, that I had changed only the way in which the loader picks a boot pool.
But the loader usually does not do that; it just uses the pool passed to it by an earlier stage (typically [gpt]zfsboot).
It turned out that zfsboot did not consult zfs_pools at all and used its own mechanism to select a boot pool.
That mechanism made use of bootinfo.bi_bios_dev and zfs_probe_dev.
I changed zfsboot to use the same boot pool selection mechanism that I had in mind for the loader.

While looking into this, I noticed that efi_zfs_probe uses zfs_probe_dev and boot_img->DeviceHandle to set pool_guid in a somewhat similar fashion.
The EFI ZFS code probably needs an update as well.

Probe the boot device first to give it preferential treatment.
In my tests this does not make a difference, as the BIOS puts the boot disk
first anyway, but other implementations may have their quirks.
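A sketch of that preferential treatment, assuming the boot disk's unit number is already known (the real code derives it from bootinfo.bi_bios_dev; the helper names are illustrative):

```c
/*
 * Illustrative only; the real code keys the boot disk off
 * bootinfo.bi_bios_dev rather than a plain unit number.
 */
extern int boot_unit;		/* the disk we were booted from */
extern int ndisks;
extern void probe_disk(int unit);

/*
 * Probe the boot disk first so that, whatever order the firmware
 * enumerates disks in, pools on the boot disk land at the head of
 * the discovery-ordered pool list.
 */
static void
probe_boot_disk_first(void)
{
	probe_disk(boot_unit);
	for (int unit = 0; unit < ndisks; unit++) {
		if (unit != boot_unit)
			probe_disk(unit);
	}
}
```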

avg added a comment.

While controlling which pool is the boot pool via the bootfs property sounds useful, it adds yet another degree of freedom to the boot configuration.
I don't want legacy "BIOS" boot and EFI boot to diverge, and I don't have time to work on the EFI ZFS boot code.
I think this change was good, but we can manage just fine without it.
So, I am withdrawing it.