Autogenerate list of ZFS Boot Environments in the beastie menu

Authored by allanjude on Jul 23 2015, 5:06 AM.



Menu does not redraw cleanly after selecting a BE (related to the extra output from read-conf)

Diff Detail

rS FreeBSD src repository
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.
There are a very large number of changes, so older changes are hidden. Show Older Changes
dteske added inline comments.Jul 31 2015, 7:29 PM
174 ↗(On Diff #7540)

Change "0 0 2swap" to instead "0 s>d 2swap"

Reason: The stack input for >number is a double-wide number (the "d" in "ud1" stands for double, indicating that the two leading "cells", aka items, on the stack are a single double-precision number). By changing from "0 0 2swap" to instead "0 s>d 2swap" it is more clear that we are pushing a zero on the stack in the form of a double (HINT: s>d converts from single to double-precision; taking the stack from "0" to "0 0"). This is important for various architectures.

175 ↗(On Diff #7540)

Whoops, stack comments here are wrong. I misspoke last night when I said >number takes 4 items on the stack as input. The initial stack element takes two cells (looks like two items) but is a double-precision value. Used as the "accumulator".

I would change the stack comments from:
( ud1 caddr1 caddr2 u2 -- ud2 caddr1 caddr2 u2 )

to instead the following [accurate] description:
( ud caddr/u -- ud' caddr'/u' )

Please note that it is standard parlance in my existing code to use var and var' (pronounced "var-prime") to indicate when the result on the stack after execution is NOT a new value, but a possibly modified form of the stack input.

In other words, the return caddr is guaranteed to be within u count of the input caddr, making it more appropriate to call it "caddr-prime" as the >number process merely increments caddr in a loop until it hits the first bad character. If all characters in the string were valid, then the return caddr will be caddr+u and u' (pronounced "u-prime") will be zero. Note however, that u' may also be zero for a NULL string input. The same thing happens with the initial "u" input, it is decremented with each step of the internal >number loop until it reaches zero, and thus is a modified "u" or "u-prime" (standard mathematics terminology).

allanjude updated this revision to Diff 7560.Aug 1 2015, 12:21 AM

Update with dteske@'s forth fixes

allanjude marked 5 inline comments as done.Aug 1 2015, 12:22 AM
dteske accepted this revision.Aug 14 2015, 4:29 PM
This revision has a positive review.Aug 14 2015, 4:29 PM

Already approve, but would like to see the summary text changed to the desired commit message prior to commit. A quick blurb in the other fields would be nice too (e.g., test).

Bump (commit plz)

tsoome added a subscriber: tsoome.Dec 12 2015, 7:31 AM
tsoome added inline comments.
158 ↗(On Diff #7540)

you have small issue there; reading in default and local loader.conf will also create new module list and therefore leak current list. I did solve this by creating simple "free-module-options" word, walking through the list and freeing all existing modules.

tsoome added inline comments.Dec 14 2015, 12:44 PM
185 ↗(On Diff #7560)

goto_menu emits true to stack, so line 185 will result in ( -- N TRUE TRUE ) and thats bug.

tsoome added inline comments.Dec 14 2015, 1:40 PM
184 ↗(On Diff #7560)

here you have ascii value for number N in stack (51 for '3'), so instead of hardwired 3, you can use instead:

dup 48 - goto_menu

this would remove the need for hardwired numeric constant.

dteske added inline comments.Dec 14 2015, 11:04 PM
158 ↗(On Diff #7560)

Can you please submit to either allanjude or myself the patch to add free-module-options word? One of us will then update the review patch.

184 ↗(On Diff #7560)

The N on the stack is ASCII decimal for selected menu item. The "3" that you see being passed to goto_menu must be the "3" in "menuset_name3=bootenv".

If I changed this to use "dup 48 -" instead of "3", then it would always load the menu associated with "menuset_nameX" where X is the menu item chosen.

Imagine the implications if the set_be_page word were used as the action for menu item 4. The goto_menu word would try to load the menu associated with menuset_name4. There is no such named-menuset.

The existing hard-wired 3 is correct because we want to specifically (and always) load menuset_name3 which is the bootenv menu.

185 ↗(On Diff #7560)

Good catch. I'll fix this when I update to include your free-module-options word

Can we move some things to menu-commands.4th to unclutter menu.rc?

143 ↗(On Diff #7560)

Can we move this to menu-commands.4th please?

154 ↗(On Diff #7560)

Can we move this to menu-commands.4th please?

170 ↗(On Diff #7560)

Can we move this to menu-commands.4th please?

dteske requested changes to this revision.Dec 14 2015, 11:07 PM
This revision now requires changes to proceed.Dec 14 2015, 11:07 PM

If we can get ahold of the free-module-options by tsoome then I think that actually makes this approach very nice (no edge-cases)

tsoome added inline comments.Dec 15 2015, 8:37 AM
184 ↗(On Diff #7560)

oh right, I misunderstood the background there. thanks for explaining:)

In D3167#96177, @dteske wrote:

If we can get ahold of the free-module-options by tsoome then I think that actually makes this approach very nice (no edge-cases)

I have sent by separate mail.

However, There is still one edge case you probably want to think about - if or how much it will apply; the case is about non-zfs boot where you have no BE's. To solve this, I used guardian construct around menuset creation like this:

\ Boot Environments are (supported) only on ZFS
s" currdev" getenv drop 4 s" zfs:" compare 0= [if]

menuset setup code


The issue is that menu code is generic and therefore will affect scenarios like ufs based installs and cd/usb boot media, where you do not have BE, and loader does not have support commands, so the BE menuset is unusable and will confuse the operator.

Note such guardian is working well in case of illumos where whole BE support is entirely implemented in forth, as we have BE index file in zfs:poolname:/boot/menu.lst maintained by OS beadm command, and to maintain consistent view of BE list, I actually don't even want to read dataset list directly from zfs.

The freebsd case should be covered by same/similar guardian as well, but perhaps I'm missing some context there (currdev versus vfs for example).

smh added a subscriber: smh.Dec 16 2015, 1:14 PM
smh added inline comments.
395 ↗(On Diff #7560)

Guard with #ifdef LOADER_ZFS_SUPPORT?

708 ↗(On Diff #7560)

Avoid assignment and just:

return (zfs_list_dataset(spa, objid));
723 ↗(On Diff #7560)

check the return codes? more below too.

781 ↗(On Diff #7560)

style(9) braces around return.

800 ↗(On Diff #7560)

move up to return early.

816 ↗(On Diff #7560)

style(9) more below too.

if (rv != 0)
        return (rv);
817 ↗(On Diff #7560)

Could the early returns here leave it in an inconsistent state that may course an issue?

820 ↗(On Diff #7560)

use sizeof(envname), more below.

833 ↗(On Diff #7560)

use sizeof(enval)

840 ↗(On Diff #7560)

As zfs_env_index is always incremented could be extracted above so we have the following instead?

if (zfs_env_index > ZFS_BE_LAST)
848 ↗(On Diff #7560)

use sizeof(envname) more below

858 ↗(On Diff #7560)

rv here is likely to be useless due to use by loop above, should that use a seperate var or just (void)unsetenv(envname) ?

1600 ↗(On Diff #7560)

Not new but due to the fact the if does a return, there's no need for the else here.

1903 ↗(On Diff #7560)

No need for else here.

smh requested changes to this revision.Dec 16 2015, 1:15 PM
smh added a reviewer: smh.
smh added inline comments.Dec 16 2015, 1:20 PM
1878 ↗(On Diff #7560)

Assign to err value and return what you go, so we don't loose info about the error + style(9) bool test fix e.g.

err = objset_get_dnode(spa, &spa->spa_mos, objnum, &dataset);
if (err != 0) {
        printf("ZFS: can't find dataset %ju\n", (uintmax_t)objnum);
        return (err);

more below too and the same with dnode_read

allanjude marked 13 inline comments as done.Dec 17 2015, 6:26 AM
allanjude added inline comments.
395 ↗(On Diff #7560)

It is already guarded, the ifdef starts above the previous function (lszfs) and ends just below 'command_reloadbe'

817 ↗(On Diff #7560)

Possibly, switching them to break

allanjude updated this revision to Diff 11365.Dec 17 2015, 6:27 AM
allanjude marked 2 inline comments as done.

Address most of smh@'s feedback

allanjude updated this revision to Diff 11366.Dec 17 2015, 6:30 AM
allanjude marked 3 inline comments as done.

Move forth functions to menu-commands.4th as requested by dteske@

tsoome added inline comments.Dec 17 2015, 11:32 AM
134 ↗(On Diff #11366)

There is actually another small issue, which has simple solution and a bit deeper implications...

The problem is, zfs dataset names can be up to 256 symbols long (more or less), and with long names, the menu box itself will get corrupted; now returning to main menu, only box
contents will get redrawn and the garbage will remain.

for simple workaround, in illumos code I did create the following word:

  1. moved *screen* redraw to separate word:

: be_draw_screen

clear         \ Clear the screen (in screen.4th)
print_version \ print version string (bottom-right; see version.4th)
draw-beastie  \ Draw FreeBSD logo at right (in beastie.4th)
draw-brand    \ Draw brand.4th logo at top (in brand.4th)
menu-init     \ Initialize menu and draw bounding box (in menu.4th)
menu-redraw   \ Redraw menu (in menu.4th)


and to replace this simple "1 goto_menu" with:

: be_back_to_main

1 goto_menu
be_draw_screen  \ return to main menu and then redraw the screen.


so at least the main screen will be cleaned up. In addition, Jeff did propose additional idea, unfortunately it would require more development and is out of scope of this change, but is something to think about: as for different menus there are menusets, the menuset concept could be extended not to involve just menu entries, but "box" itself as well - to define wider and perhaps with different offset box for submenu. for example, for content like BE names, we know we wont fit in width, so we could define also wider box for this set, overlay it on main view, and if needed, apply additional method to "cut" excessive information. overlay covering (partially) logos and such is okay, as going back to main menu would redraw the screen.

dteske added inline comments.Dec 17 2015, 5:38 PM
404 ↗(On Diff #11366)

Previously pointed out by tsoome, and I agree, we need to kill this line (TRUE) because it leaks an extra TRUE onto the stack (goto_menu already throws a TRUE on the stack)

dteske added inline comments.Dec 17 2015, 5:42 PM
134 ↗(On Diff #11366)

I've been holding back a redesign that would address this. I was having issues finding sympathetic parties, such as yourself, that want to solve this issue like myself. Most commonly, say at conferences, I was getting push-back from other developers that were loath to see yet another redesign (not appreciating the technical merit behind such a redesign; specifically to address the stated issue of menu overflow leading to screen garbage).

I got support from another interested party just moments ago (Shawn Webb; aka lattera) and am going to move forward with the redesign that I have envisaged over a year ago to address this issue.

I'll post mock-ups in the next 48 hours.

allanjude marked 9 inline comments as done.Dec 25 2015, 5:29 PM
allanjude added inline comments.
158 ↗(On Diff #7560)

I have pulled in free-module-options from tsoome (thanks again). Minor changes to it since IllumOS extended the modules structure

185 ↗(On Diff #7560)

both of these are fixed

allanjude updated this revision to Diff 11688.Dec 25 2015, 5:33 PM
allanjude marked 2 inline comments as done.

Updated with fixes from tsoome

Includes simplified menu layout a bit more like IllumOS

forth stuff:

remove excess 'true' in set_be_page

allanjude updated this revision to Diff 11810.Dec 31 2015, 5:12 AM

Update diff to not populate the BE list if loader.conf bootenv_autolist="NO"

allanjude updated this revision to Diff 11811.Dec 31 2015, 5:51 AM

Do not evaluate "unload", call it directly

allanjude updated this revision to Diff 11813.Dec 31 2015, 6:21 AM

More forth nonsense ;)

tsoome added inline comments.Dec 31 2015, 8:45 AM
1049 ↗(On Diff #11813)

actually the comment on this line was correct one - this strdup does introduce memory leak, and it actually is unneeded, file name produced by this word will be consumed by fopen and therefore strget is just enough to produce addr len pair into the stack.

allanjude added inline comments.Dec 31 2015, 5:27 PM
1049 ↗(On Diff #11813)

Sorry for the confusion. This is an unrelated change that Phabricator is mis-identifying. I should have done a separate update to Phabricator when I rebased my tree on the latest -current

This change is part of rS292899 and unrelated to the ZFS BE menu work.

allanjude updated this revision to Diff 11822.Dec 31 2015, 5:32 PM

Avoid double-redraw of the menu in be_back_to_main

allanjude updated this revision to Diff 11823.Dec 31 2015, 5:35 PM

Remove stray newlines

allanjude updated this revision to Diff 11824.Dec 31 2015, 5:50 PM

Guard against currdev not being set (rarely happens, but lets not make the menu puke if it does)

allanjude updated this revision to Diff 11825.Dec 31 2015, 6:02 PM

Eliminate be_back_to_main

allanjude updated this revision to Diff 11827.Dec 31 2015, 7:30 PM
allanjude marked 5 inline comments as done.

Final fixes

I think this is ready for commit now

dteske added inline comments.Dec 31 2015, 7:35 PM
393 ↗(On Diff #11825)

The comment at the end of this line appears out of place and I would remove it to make this menu item action function more like the others in this file.

136 ↗(On Diff #11825)

I'm pushing the redesign out to the future (pre-11-R) where we can phab it up in an unrelated change.

With respect to the menu redraw, I worked with Allan to fine-tune the be_draw_menu function. We removed menu-redraw from be_draw_menu because using be_draw_menu in conjunction with goto_menu would cause a double-redraw of the menu, wherein first it would erase the current contents, redraw the current contents, then erase the entire screen, redraw the screen, and finally draw the desired menu. By moving menu-redraw out of be_draw_menu (and subsequently adding menu-redraw to set_bootenv) we avoid the double-redraw with stale menu data.

It's worth noting that at 9600 baud on a serial line, this double-redraw is "in your face" since it takes ~4.95s to draw the entire screen at said baudrate.

1049 ↗(On Diff #11825)

You're right. I fixed the actual leak in SVN r292999

allanjude updated this revision to Diff 11828.Dec 31 2015, 7:36 PM
allanjude marked 3 inline comments as done.

Resync with -current

allanjude updated this revision to Diff 11829.Dec 31 2015, 7:39 PM

Rebase again

dteske accepted this revision.Dec 31 2015, 7:47 PM

w00t x infinity

This revision was automatically updated to reflect the committed changes.