Page MenuHomeFreeBSD

userboot: handle guest interpreter mismatches more intelligently
ClosedPublic

Authored by kevans on Aug 29 2018, 4:46 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Dec 16, 12:58 PM
Unknown Object (File)
Sun, Dec 8, 10:50 AM
Unknown Object (File)
Sun, Dec 8, 10:50 AM
Unknown Object (File)
Sun, Dec 8, 10:50 AM
Unknown Object (File)
Sun, Dec 8, 10:49 AM
Unknown Object (File)
Sun, Dec 8, 10:49 AM
Unknown Object (File)
Sun, Dec 8, 10:49 AM
Unknown Object (File)
Sun, Dec 8, 10:49 AM
Subscribers

Details

Summary

The switch to lualoader creates a problem with userboot: the host is inclined to build userboot with Lua, but the host userboot's interpreter must match what's available on the guest. For almost all FreeBSD guests in the wild, Lua is not yet available and a Lua-based userboot will fail.

This revision updates userboot protocol to version 5, which adds a swap_interpreter callback to request a different interpreter, and tries to determine the proper interpreter to be used based on how the guest userboot.so is compiled. This is still a bit of a guess, but it's likely the best possible guess we can make in order to get it right. The interpreter is embedded in the version string, so we can open userboot.so on the guest and hunt that down to locate the interpreter it was built with.

Using -l with bhyveload will not allow an intepreter swap, even if the loader specified happens to be a userboot with the wrong interpreter. We'll simply complain about the mismatch and bail out.

For legacy guests without the updated version string, we assume they're 4th. For new guests with the updated version string, we'll read it and swap over to the proper interpreter if it doesn't match what the userboot we're using was compiled with.

Both flavors of userboot are installed by default, userboot_4th.so and userboot_lua.so. This fixes the build WITHOUT_FORTH as a coincidence, which was broken by userboot being forced to 4th.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

(Note to Rod: this is only partially what's going to end up in the review... this isn't a final product for review yet)

(Note to Rod: this is only partially what's going to end up in the review... this isn't a final product for review yet)

Just an Ack, I was waiting for more information anyway as the summary is very sparse so I figured it was WIP. I'll also wait to add the bhyve group until after this is more filled out, or either imp@ or you can do it when you feel it is ready.

kevans added a reviewer: imp.
kevans retitled this revision from userboot hacks to userboot: handle guest interpreter mismatches more intelligently.
kevans edited the summary of this revision. (Show Details)
kevans added a reviewer: bhyve.

(Puts on tomato shield)

A fast look over bhyveload it looks ok. I suppose you have done tests with these changes on bhyveload.

A fast look over bhyveload it looks ok. I suppose you have done tests with these changes on bhyveload.

Indeed- I tested all of the cases listed: legacy setup where it's forthloader with old version string, lua+forth guest with each of the userboots in place as /boot/userboot.so, tolerance for a requested interpreter not existing on the host, and explicitly requesting a loader with -l.

A fast look over bhyveload it looks ok. I suppose you have done tests with these changes on bhyveload.

Indeed- I tested all of the cases listed: legacy setup where it's forthloader with old version string, lua+forth guest with each of the userboots in place as /boot/userboot.so, tolerance for a requested interpreter not existing on the host, and explicitly requesting a loader with -l.

Great, thank you @kevans for all the effort you have put on it.

For the bhyveload side, I'm ok with the changes.

Address a minor nit... we can't return from the swap interpreter callback without blowing things up since we've since dlclose()d the loader we were using. This is likely sufficiently edge-casey to not matter, though: all paths through userboot should either explicitly exit or they kill the process.

Another possible issue I observed while re-combing through... I've left the environment intact from the first pass through userboot. currdev, LINES, and all cons_probe() related environment variables will exist in the second pass. Do we care? I suspect not a lot, since we'll redetect currdev and LINES is static; cons_probe() will pick up the results of the previous run.

I think you can get around the edge case of dlclose by using setjmp/longjmp as outlined.

usr.sbin/bhyveload/bhyveload.c
589 ↗(On Diff #47472)

I think you can just set loader = strdup(requested_interp) here, as well as reinit = 1. Then do a longjmp(). You don't need to do other refactoring.

665 ↗(On Diff #47472)

You'd need to initialize h to be NULL for my trick to work.

770 ↗(On Diff #47472)

And I'd keep this stuff. If h is != 0 when we get here, we do a dlclose(h); first since now it's off the stack and we can safely do that.

836 ↗(On Diff #47472)

Add a setjmp here.

852–856 ↗(On Diff #47472)

I wouldn't do this refactoring.

Greatly simplified bhyveload bits, as recommended by @imp. need_reinit, the dlopen handle, and the loader string itself move off the stack so they're not clobbered by the longjmp. The callback now triggers a vm_reinit as well. Previous testing was performed again, no regressions noticed.

jhb added inline comments.
stand/userboot/userboot/main.c
80 ↗(On Diff #47487)

This is almost strnstr(). I wonder if you could write the parsing in terms of more standard functions. It's not really clear what you are parsing and what format you are expecting the string to have. A comment describing that would probably be helpful.

130 ↗(On Diff #47487)

In the case of the file buffer, 'verstr' isn't necessarily NULL-terminated. You should use strnchr instead.

158 ↗(On Diff #47487)

So the only odd thing about this is that (if I'm reading this correctly) we are using /boot/userboot.so in the guest to decide which loader /boot/loader in the guest is? If the guest were an actual machine that booted (e.g. if you took the disk image and fired it up in vbox), it wouldn't use /boot/userboot.so at all. If you could instead inspect some information from the /boot/loader in the guest, that would seem a more direct way of figuring out which /boot/loader the guest is using.

It looks like you could instead perhaps do something like this:

buf = NULL;
if (stat("/boot/loader", &st) != 0)
    return;
fd = open("/boot/loader", O_RDONLY);
if (fd < 0)
    return;

rdsize = st.st_size;
buf = malloc(rdsize);
if (buf == NULL)
    goto out;
if (read(fd, buf, rdsize) != rdsize)
    goto out;

if (strnstr(buf, "$LuaVersion: Lua", rdsize) != NULL)
    guest_interp = "lua";
else
    guest_interp = "4th";

Coupled with my note below about LOADER_INTERP that means you wouldn't need the existing string search routines.

173 ↗(On Diff #47487)

Can my_interp just be a compile-time thing, something like:

#ifdef LUA
  my_interp = "lua";
#else
  my_interp = "4th";
#endif

? Or it could even be a #define NAME and not a variable at all perhaps?

Looks like we don't have an existing #define that I could see. I would edit loader.mk to add '-DLOADER_INTERP="${LOADER_INTERP}"' to CFLAGS, then you can just use LOADER_INTERP in place of 'my_interp'.

175 ↗(On Diff #47487)

Normal style(9) is to add a blank line before comments (here and a few other places).

usr.sbin/bhyveload/bhyveload.c
594 ↗(On Diff #47487)

Maybe use asprintf() instead of malloc + snprintf?

I'll take a look at these later tonight(-ish) (CDT), thanks!

stand/userboot/userboot/main.c
80 ↗(On Diff #47487)

Right, this is generally strnstr with an added ignorance of NULL terminators, since we're scraping the binary.

stand/userboot/userboot/main.c
80 ↗(On Diff #47487)

Ah, I missed the nul char thing, in that case you want to use 'memmem'. In the example code I gave earlier you would replace the call to strnstr with memmem something like:

#define LUA_SIGNATURE "$LuaVersion: Lua"
...
    if (memmem(buf, rdsize, LUA_SIGNATURE, strlen(LUA_SIGNATURE)) != NULL)
       guest_interp = "lua";

If we needed a signature string for 4th you could use "ficl Version", but I think we are fine to just fall back to 4th by default without checking for an explicit signature.

stand/userboot/userboot/main.c
158 ↗(On Diff #47487)

We don't use /boot/loader with userboot. The userboot.so emulates /boot/loader to create the kernel image that we can then jump to.

So we're using the host's userboot.so to read in the guest's userboot.so to see which version that is so we know whether or not to use ficl or lua so we can load the correct one in bhyveboot so that it can run the scripts that are inside the guest.

But only if the user didn't override on the command line. Otherwise the new userboot.so (being lua) won't be able to boot the old systems (which have no lua).

173 ↗(On Diff #47487)

No. You don't know the interpreter when you are compiling this binary.
You could know it at runtime, but we don't yet publish that.
The interpreter you are using is all down to which interp_xxx.o you link in.

stand/userboot/userboot/main.c
80 ↗(On Diff #47487)

I'd really really rather we look at the version string for the signature, not anything LUA implementation specific. It's the whole reason I changed the version string to include this.

173 ↗(On Diff #47487)

Actually, I take that back. You do know when compiling main.
I generally feel ill-at-ease doing this trick, but here it would be fine.

stand/userboot/userboot/main.c
158 ↗(On Diff #47487)

userboot on the host is emulating the effect of /boot/loader in the running guest. It is using all the same files from the guest's fs that the guest's /boot/loader would use. That is, suppose you take the raw disk image foo.img or what have you, from a bhyve vm and boot it in virtual box (or dd it to a USB stick and boot on bare metal). That will _not_ use the guest's userboot. It will use the guest's /boot/loader (whether it is 4th or lua). The role of userboot on the _host_ is to emulate the effect of /boot/loader in the _guest_. userboot in the _guest_ would only be used if bhyve supported nesting to boot nested guests.

stand/userboot/userboot/main.c
158 ↗(On Diff #47487)

I don't particularly care which we use, but switching it up will require a different version string modifications to include the interpreter. The default interpreter is the default interpreter across the board, so I guess it doesn't really matter- userboot just seemed like the sensible choice at the time given that one operating this setup has direct exposure to userboot already.

stand/userboot/userboot/main.c
158 ↗(On Diff #47487)

So today if I wanted to switch my loader for some reason on an existing system I would do it by modifying the /boot/loader hard link. 'ln -f /boot/loader_4th /boot/loader' or the like. As much as possible, guests should behave the same way. In particular, if booting with EFI which doesn't use user boot, to change a guest your would alter the /boot/loader.efi hardlink just as you would on bare metal. We should permit the same for the non-EFI boot. I think that means that you need to look at /boot/loader in the guest as that is the thing you are emulating.

kevans marked 17 inline comments as done.

Fix semantics and simplify. Notable changes:

common/interp_{4th,lua,simp} now export an $Interpreter mark, which userboot will search for. userboot has been switched to check the guest's /boot/loader for the interpreter to use, rather than the guest's /boot/userboot.so. This also simplifies the search, as $Interpreter: is most likely going to be unique in the resulting binary (as opposed to when we were searching userboot, where the marker would then appear twice).

stand/userboot/userboot/main.c
158 ↗(On Diff #47487)

@imp didn't have time to comment here, but we had a little discussion out-of-band and that was also generally the argument he presented. New version reflects this, and is greatly simplified as a result. =)

This revision is now accepted and ready to land.Aug 31 2018, 4:35 PM
stand/userboot/userboot/main.c
158 ↗(On Diff #47487)

Yea, at first I was resistant to the idea so I wanted some time to think... then I got to thinking about it in the shower this morning and came to much the same conclusion you did: what makes userboot.so in the guest special? Nothing. /boot/loader is what we want to emulate as much as possible. We don't get 100% emulation, but we'll be pretty close.

Looks good to me. I could nit-pick all day at this stuff, but I'd rather get this in and cope with the nits after the release.

Thanks!

stand/userboot/userboot/Makefile
36 ↗(On Diff #47533)

I've sent this to re@ with this line omitted- it was a leftover from some other testing that didn't actually get used.

This revision was automatically updated to reflect the committed changes.