Page MenuHomeFreeBSD

Extend loader(8) geli support to all architectures and all disk-like devices.
ClosedPublic

Authored by ian on Jun 10 2018, 8:09 PM.

Details

Summary

This moves the bulk of the geli support from lib386/biosdisk.c into a new geli/gelidev.c which implements a devsw-type device whose dv_strategy() function handles geli decryption. Support for all arches comes from moving the code to taste-and-attach to the devopen() function in libsa.

After opening any DEVT_DISK device, devopen() calls the new function geli_probe_and_attach(), which will "attach" the geli code to the open_file struct by creating a geli_devdesc instance to replace the disk_devdesc instance in the open_file. That routes all IO for the device through the geli code.

A new public geli_add_key() function is added, to allow arch/vendor-specific code to add keys obtained from custom hardware or other sources.

With these changes, geli support will be compiled into all variations of loader(8) on all arches because the default is WITH_LOADER_GELI. Is that an appropriate default for all arches?

Test Plan

Compiles with make universe (both defaulting to WITH_LOADER_GELI and when built using WITHOUT_LOADER_GELI). Run-tested only on 32-bit arm at the moment.

Now passes all the legacy-bios tests generated by tools/boot/rootgen.sh. I'm not yet able to test uefi flavors.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

ian created this revision.Jun 10 2018, 8:09 PM
imp added a comment.Jun 10 2018, 8:38 PM

OK. Generally, I really like it.

I'll have to take a closer look at it than I have time for today over the next several days.

imp added inline comments.Jun 10 2018, 9:30 PM
stand/geli/geliboot.c
370 ↗(On Diff #43562)

One idea here is that you could make geli_passphrase() be a function pointer that's normally set to this function (perhaps renamed). Then you could get the keying material from TPM or fpga or whatever crazy thing in your unique environment.

stand/geli/gelidev.c
57 ↗(On Diff #43562)

We do special things in some places based on this, so I'm going to have to study this more carefully.

74 ↗(On Diff #43562)

I think this breaks some assumptions of other code that "know" about devdesc after looking at dv_type. I'd have to go looking for that, which I will tomorrow when I have more time.

ian added inline comments.Jun 10 2018, 10:55 PM
stand/geli/gelidev.c
74 ↗(On Diff #43562)

I guess to be safe from stuff like that, the first field should be a disk_devdesc, then if other code assumes because it's a DEVT_DISK it can safely look at desc->{d_slice,d_partition,d_offset} all would be well. Now that some of the ugliness in the biosdisk layer is gone, I'm not sure that happens anywhere.

This is much cleaner. Thank you for putting in the effort.

I'd like to get Warner's rootgen.sh tests to support GELI, and confirm that is working before we land this, so we can check for regressions. I expect to accomplish that before the week is out.

imp added inline comments.Jun 13 2018, 3:40 PM
stand/geli/geliboot.c
151 ↗(On Diff #43562)

is it intentional that we never free this when there's an error?

stand/geli/gelidev.c
130 ↗(On Diff #43562)

is int the right type here, or would int64_t be better?

Some thoughts here:

This solution is a different approach from what I took in my own GELI EFI work. The GELI patches I created (at this point, almost two years ago) were based around EFI drivers, and more generally, an EFI-based boot1 and loader architecture. An approach like this wouldn't have worked, as there was no way to get the data from boot1 to loader.

HOWEVER, The elimination of boot1 from the picture changes things considerably, and now this kind of architecture does work, and is probably how I would have done things had there been no boot1 when I began working on GELI EFI support originally.

This aside, my only comment is that I'd like to see an interface that could more easily tie in to an external key storage system, such as a KMS. I did something like this in my GELI work, and one of the precursor patches can be found here: https://reviews.freebsd.org/D12698. My EFI work always used the EFI KMS interface, and a "fake" in-memory KMS is provided for the cases where there is no genuine device. This could happen in a follow-on changeset, though.

ian added a comment.Jun 20 2018, 5:07 PM

Some thoughts here:
[...]
This aside, my only comment is that I'd like to see an interface that could more easily tie in to an external key storage system, such as a KMS. I did something like this in my GELI work, and one of the precursor patches can be found here: https://reviews.freebsd.org/D12698. My EFI work always used the EFI KMS interface, and a "fake" in-memory KMS is provided for the cases where there is no genuine device. This could happen in a follow-on changeset, though.

I added the geli_add_key() function so that keys obtained from unspecified external sources could be added to the keyring used internally by the geli code. The prior interface only allowed the import of an entire block of 0-n keys, and doing so completely replaced any existing keys that had been added earlier. Now components that know nothing of each other can all add keys from sources they're responsible for. I need such a feature for $work, but wanted to keep changes related to that as minimalist as possible, because this round of changes is big enough already. If a larger and more flexible key management subsystem is needed in loader, this minimalist approach now should make it easy to evolve towards that.

stand/geli/geliboot.c
151 ↗(On Diff #43562)

Yes it was intentional, my thinking was that this is now a known device and its presence in the list means we've probed it in the past and don't need to do IO again to figure out whether it's geli or not. But in looking at the code, I now think I allocate the kdev and link it into a the list a bit too early... there are a couple places that can exit for errors that don't exactly mean "we know it's not geli" (malloc or IO errors, all not-very-likely). I'll update that for the next revision.

stand/geli/gelidev.c
130 ↗(On Diff #43562)

There's no particular reason to make it 64 bits, the compiler will sign-extend it as needed, it's the unsignedness of the 2nd arg that causes the problem. (Stumbling across this 64 vs 32 quirk in roundup/down2() makes me want to examine every use of them in the whole source base, and add a big warning comment where the macros are defined.)

ian updated this revision to Diff 44166.Jun 20 2018, 5:36 PM

Adapt to the move of stand/geli to stand/libsa/geli.

Make the struct at the start of a geli_devdesc be a disk_devdesc rather than a plain devdesc, and populate its fields by copying them from the underlying host provider. This should make it safe for any code that sees devdesc.d_type == DEVT_DISK and assuming that it can then treat the struct as a disk_devdesc.

Don't create a known_device instance until we've succesfully allocated an IO buffer and read the last sector of the host provider into it. This will prevent linking a known_device into the list making it appear we know for sure the device is not geli-encrypted when the real problem was memory allocation or IO error (all pretty unlikely, but this is more correct).

ian updated this revision to Diff 44819.Jul 3 2018, 5:46 PM
ian edited the test plan for this revision. (Show Details)

Update the diff to fix a couple bugs.

  • In geli_dev_strategy(), free iobuf (allocated in the function), not buf (passed in by the caller).
  • Make geli_havekey() return ENOENT on error instead of 1.
  • Fix error early-exit logic in geli_probe() to not dereference a null pointer and to not record the is/is-not geli status of a drive if probing the drive gets an IO error.

I get compile errors trying to build the latest

ian marked 4 inline comments as done.Jul 4 2018, 1:49 PM

I get compile errors trying to build the latest

I get compile errors trying to build the latest

The phab auto-build thing says it builds, and it builds for me. If you get errors about .code16 in asm code, you need to update to the latest toolchain.

ian added a comment.Jul 13 2018, 4:41 PM

UEFI testing completed; this code now passes all the tests generated by tools/boot/rootgen.sh, and has also been tested on armv7 systems with ubldr. I think it is ready to commit.

imp accepted this revision.Jul 13 2018, 4:47 PM

let's get this in, modulo the one nit.

stand/libsa/geli/gelidev.c
324 ↗(On Diff #44819)

you should fix it.

This revision is now accepted and ready to land.Jul 13 2018, 4:47 PM
This revision was automatically updated to reflect the committed changes.