Page MenuHomeFreeBSD

allow creation of non-recursive boot environments (libbe)
ClosedPublic

Authored by rew on Dec 14 2018, 10:19 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Mar 19, 3:59 PM
Unknown Object (File)
Tue, Mar 19, 2:35 PM
Unknown Object (File)
Tue, Mar 19, 8:44 AM
Unknown Object (File)
Tue, Mar 19, 8:44 AM
Unknown Object (File)
Tue, Mar 19, 8:44 AM
Unknown Object (File)
Tue, Mar 19, 8:44 AM
Unknown Object (File)
Tue, Mar 19, 8:44 AM
Unknown Object (File)
Tue, Mar 19, 8:44 AM
Subscribers

Details

Summary

be_deep_clone() unconditionally clones all child datasets of a given boot environment
(e.g. zroot/ROOT/default). Also, libbe doesn't provide an API that allows a user to
create a non-recursive boot environment.

These changes implement an API function named _be_create(), which provides the ability to
create a recursive or non-recursive boot environment. The existing be_create* functions
are now proxied through _be_create(), their current behavior is unchanged. One of the
problems with the existing be_create* functions is that they don't expose any notion
that boot environments can be created recursively or not.

The logic used to create a recursive boot environment has been moved from be_deep_clone()
and placed in _be_create().

_be_create() also acquired the logic that was in be_create_from_existing_snap().

This code still needs some massaging. For example, _be_create() should be named something
else. The obvious name is already taken and can't be changed or the existing API will break.

working off r342069

Test Plan

provided in lib/libbe/tests

  1. create a test zpool and zroot/ROOT with child datasets
  2. create recursive bootenv, check existence of child datsets
  3. create non-recursive bootenv, check non-existence of child datasets

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Just a heads up- I'm letting this one stew a little bit. I'm a big +1 for the refactoring bits, but I'm going to refrain from providing review while I think about how to fix the API. Small brain dump so that Allan or yourself (being the one that's identified the need for a way to do non-recursive creation) can pitch in to my inner monologue:

On one hand, this feels like a major oversight and I wish I would have caught it to make an explicit decision before it appeared in 12.0.

On the other hand, I can kind of see why it is the way it is. The current incarnation will basically do the right thing for 90% of use-cases and it will do the right for the only in tree consumer. My initial thought is that explicitly non-recursive creation is likely the outlier and might warrant just adding a be_create_shallow to fill this gap.

I'm torn between adding a be_create_shallow and adding an API instead to just do arbitrary depth instead of playing a guessing game of what libbe(3) user is wanting to do -- I can't imagine too many scenarios where something in-between would be useful, but wanting to graft other parts onto a new environment past a certain level (e.g. wants up to /usr/local to be part of the boot environment by default, but some parts past /usr/local get special treatment) seems like a reasonable use-case. I think the default be_create is OK enough for most uses, though. Thoughts?

(Edited to clarify up to /usr/local by *default* - we'll assume their intention is for /usr/local to always be there, but perhaps /usr/local/etc gets grafted on from a common source outside of the boot environment for bootstrap but needs to be part of the environment for testing/changing and not breaking the known-working common bits)

I agree, the default be_create is adequate for most cases. If anything, I would lean
towards a be_create_shallow, here's my thinking. be_create is, effectively,
a recursive snapshot that gets cloned - which is consistent with the zfs
behavior of 'zfs snapshot -r'. be_create_shallow would be consistent with
a 'zfs snapshot', minus the recursive flag.

Not to digress from libbe but, to show how bectl uses libbe for creating
boot environments. bectl advertises that it can create recursive and
non-recursive boot environments but, that doesn't seem to be the case.
This is what I'm talking about, consider the following.

Given the boot environment:

zroot/ROOT/default
zroot/ROOT/default/usr
zroot/ROOT/default/usr/obj
zroot/ROOT/default/usr/src

bectl create -e default new

/* create a non-recursive boot environment */

zfs list

/* we end up with the following */
...
zroot/ROOT/new
zroot/ROOT/new/usr
zroot/ROOT/new/usr/obj
zroot/ROOT/new/usr/src
...

I would have expected only 'zroot/ROOT/new' to be created.

To complete the example:

bectl create -r -e default newer

/* create a recursive boot environment */

zfs list

...
zroot/ROOT/newer
zroot/ROOT/newer/usr
zroot/ROOT/newer/usr/obj
zroot/ROOT/newer/usr/src
...

That's expected.

Since libbe doesn't provide a way to create non-recursive boot environments,
neither does bectl. I also think a be_create_shallow would fill this gap. bectl
would use be_create_shallow to create a non-recursive boot environment.

At any rate, that's my two cents. Is my understanding of recursive/non-recursive
boot environments flawed?

To wander off on a tangent, could 'be_add_child' be leveraged to provide
the grafting scenarios you're describing? I glanced over the
be_add_child code, which looks experimental.

In D18564#398029, @rob.fx907_gmail.com wrote:

I agree, the default be_create is adequate for most cases. If anything, I would lean
towards a be_create_shallow, here's my thinking. be_create is, effectively,
a recursive snapshot that gets cloned - which is consistent with the zfs
behavior of 'zfs snapshot -r'. be_create_shallow would be consistent with
a 'zfs snapshot', minus the recursive flag.

Not to digress from libbe but, to show how bectl uses libbe for creating
boot environments. bectl advertises that it can create recursive and
non-recursive boot environments but, that doesn't seem to be the case.
This is what I'm talking about, consider the following.

As an aside- such digressions are fine in this context. Framing things in terms of a consumer or theoretical consumer certainly helps get the point across.

Given the boot environment:

zroot/ROOT/default
zroot/ROOT/default/usr
zroot/ROOT/default/usr/obj
zroot/ROOT/default/usr/src

bectl create -e default new

/* create a non-recursive boot environment */

zfs list

/* we end up with the following */
...
zroot/ROOT/new
zroot/ROOT/new/usr
zroot/ROOT/new/usr/obj
zroot/ROOT/new/usr/src
...

I would have expected only 'zroot/ROOT/new' to be created.

To complete the example:

bectl create -r -e default newer

/* create a recursive boot environment */

zfs list

...
zroot/ROOT/newer
zroot/ROOT/newer/usr
zroot/ROOT/newer/usr/obj
zroot/ROOT/newer/usr/src
...

That's expected.

Since libbe doesn't provide a way to create non-recursive boot environments,
neither does bectl. I also think a be_create_shallow would fill this gap. bectl
would use be_create_shallow to create a non-recursive boot environment.

Right, so, it does sound like we're on similar pages. IMO- all of this should have been recursive by default as far as bectl/libbe are concerned. If you've setup a BE with children, you almost certainly want them included in most operations and most would find it a surprise the way it is now, I would think. They're not deployed very widely as far as I know, though.

At any rate, that's my two cents. Is my understanding of recursive/non-recursive
boot environments flawed?

Nope, that's how it works. =)

To wander off on a tangent, could 'be_add_child' be leveraged to provide
the grafting scenarios you're describing? I glanced over the
be_add_child code, which looks experimental.

Indeed, and I've got a branch that's fixing up the bectl add and be_add_child functionality that I'll hopefully commit in about a week and a half's time. This is the expected way to, say, convert an existing shallow setup to a deep setup. It has some caveats in it that I've needed to think through, though, such as how to make it actually work when the dataset exists and is non-empty. It's a WIP. =)

I think be_create_shallow is the way forward for this review, though.

It's probably poor style to change the user interface for bectl but, it's
already recursive by default - could always ignore the '-r' flag and add a flag
for shallow creation. Upside here is the current behavior remains unchanged.
Downside being, confusion for current users (i.e. the shell game of
command-line flags).

Going forward with be_create_shallow...

What do you think about renaming '_be_create' to 'be_clone' and declare it as a
static function. Then all the be_create* (including be_create_shallow) functions
would be proxied through 'be_clone'. The problem here is, the function
'be_deep_clone' name would be mis-leading since it no longer performs a true deep
clone (i.e. it would only be cloning a single dataset).

Some things to note is 'be_create_shallow' will need to share code with
'be_create_from_existing_snap' and 'be_deep_clone'. Currently, all be_create*
functions are proxied through 'be_create_from_existing_snap'.

The main problems I enountered are:

'be_create_from_existing_snap' can't be changed (it's already in the API).

'be_deep_clone' is where the recursive boot environment creation is
happening via 'zfs_iter_filesystems'.

'zfs_iter_filesystems' is using 'be_deep_clone' as a callback to clone the
child datasets. The 'zfs_iter_filesystems' callback function is expected to
have a signature such as: int callback(zfs_handle_t *, void *). Which means we
can't pass a parameter to to be_deep_clone to optionally call
'zfs_iter_filesystems'.

other approaches:

  1. create a boolean property for shallow creation in the 'libbe_deep_clone' struct and check that property in 'be_deep_clone()' to decide whether child datasets should be cloned or not. To implement this, the be_create* functions would create a 'libbe_deep_clone' struct in accordance to how the boot environment should be created (i.e. bootenv name, bootenv root, from_snapshot, is_shallow). This would require a function that accepts a 'libbe_deep_clone' struct as an argument and creates a boot environment from it. This approach feels like it would go against the current design of libbe.
  1. factor shared code out of 'be_create_existing_snap' and 'be_deep_clone' so 'be_create_shallow' can use it.

I would like to continue to tackle this problem if you have time provide guidance/suggestions. If not, no worries.


At any rate, thanks for the feedback!

In D18564#398209, @rob.fx907_gmail.com wrote:

It's probably poor style to change the user interface for bectl but, it's
already recursive by default - could always ignore the '-r' flag and add a flag
for shallow creation. Upside here is the current behavior remains unchanged.
Downside being, confusion for current users (i.e. the shell game of
command-line flags).

Right, that seems reasonable. I guess with lack of example from other ZFS things, -R perhaps for shallow as the logical negation of recursive?

Going forward with be_create_shallow...

What do you think about renaming '_be_create' to 'be_clone' and declare it as a
static function. Then all the be_create* (including be_create_shallow) functions
would be proxied through 'be_clone'. The problem here is, the function
'be_deep_clone' name would be mis-leading since it no longer performs a true deep
clone (i.e. it would only be cloning a single dataset).

Some things to note is 'be_create_shallow' will need to share code with
'be_create_from_existing_snap' and 'be_deep_clone'. Currently, all be_create*
functions are proxied through 'be_create_from_existing_snap'.

The main problems I enountered are:

'be_create_from_existing_snap' can't be changed (it's already in the API).

'be_deep_clone' is where the recursive boot environment creation is
happening via 'zfs_iter_filesystems'.

'zfs_iter_filesystems' is using 'be_deep_clone' as a callback to clone the
child datasets. The 'zfs_iter_filesystems' callback function is expected to
have a signature such as: int callback(zfs_handle_t *, void *). Which means we
can't pass a parameter to to be_deep_clone to optionally call
'zfs_iter_filesystems'.

other approaches:

  1. create a boolean property for shallow creation in the 'libbe_deep_clone' struct and check that property in 'be_deep_clone()' to decide whether child datasets should be cloned or not. To implement this, the be_create* functions would create a 'libbe_deep_clone' struct in accordance to how the boot environment should be created (i.e. bootenv name, bootenv root, from_snapshot, is_shallow). This would require a function that accepts a 'libbe_deep_clone' struct as an argument and creates a boot environment from it. This approach feels like it would go against the current design of libbe.
  1. factor shared code out of 'be_create_existing_snap' and 'be_deep_clone' so 'be_create_shallow' can use it.

I would like to continue to tackle this problem if you have time provide guidance/suggestions. If not, no worries.

I'll get back to you on the rest of this when I've returned in a couple of days. Flew south for the winter last Wednesday, and will be returning in Wednesday/Thursday timeframe.

Do you use IRC or anything of the sort? Some of this might be easier to hash out in real-time, and we can summarize the results of said discussion back in this review.

In D18564#398209, @rob.fx907_gmail.com wrote:

Going forward with be_create_shallow...

What do you think about renaming '_be_create' to 'be_clone' and declare it as a
static function. Then all the be_create* (including be_create_shallow) functions
would be proxied through 'be_clone'. The problem here is, the function
'be_deep_clone' name would be mis-leading since it no longer performs a true deep
clone (i.e. it would only be cloning a single dataset).

This sounds reasonable -- be_deep_clone should be renamed, it's local to this compilation unit and has no effect on consumers.

Some things to note is 'be_create_shallow' will need to share code with
'be_create_from_existing_snap' and 'be_deep_clone'. Currently, all be_create*
functions are proxied through 'be_create_from_existing_snap'.

The main problems I enountered are:

'be_create_from_existing_snap' can't be changed (it's already in the API).

I don't follow where the problem is here. It sounds like you were only talking about refactoring a bit to eliminate potential duplication, as long as the symbol name/arguments/behavior all remain the same there is no problem.

'be_deep_clone' is where the recursive boot environment creation is
happening via 'zfs_iter_filesystems'.

Right, that can all change; it's internal to the library and requires no change to the public interface.

'zfs_iter_filesystems' is using 'be_deep_clone' as a callback to clone the
child datasets. The 'zfs_iter_filesystems' callback function is expected to
have a signature such as: int callback(zfs_handle_t *, void *). Which means we
can't pass a parameter to to be_deep_clone to optionally call
'zfs_iter_filesystems'.

That second parameter you pass in some kind of struct containing whatever data you want to make available to the callback, and the callback can just assume that it's being passed a pointer to a struct of that type since it's all internal usage. That's the sole purpose of struct libbe_deep_clone -- passing context in to the callback, which would include your recursive parameter. You'd set this context up even if you're calling the callback directly.

other approaches:

  1. create a boolean property for shallow creation in the 'libbe_deep_clone' struct and check that property in 'be_deep_clone()' to decide whether child datasets should be cloned or not. To implement this, the be_create* functions would create a 'libbe_deep_clone' struct in accordance to how the boot environment should be created (i.e. bootenv name, bootenv root, from_snapshot, is_shallow). This would require a function that accepts a 'libbe_deep_clone' struct as an argument and creates a boot environment from it. This approach feels like it would go against the current design of libbe.

Keep in mind that libbe_deep_clone is for internal usage only; that header isn't installed, we can hide whatever ugliness we want to in there. =)

  1. factor shared code out of 'be_create_existing_snap' and 'be_deep_clone' so 'be_create_shallow' can use it.

This is also a reasonable approach. I don't think you can go wrong either way, to be honest. I don't know how widely applicable this will be after this particular problem's solved, and we can always elect to take one approach and revise later if we decide we've made a terrible mistake. I'm not biased one way or the other, though- libbe_deep_clone already contains the parameters for a clone, adding a 'shallow' flag wouldn't feel completely dirty.

Do you use IRC or anything of the sort? Some of this might be easier to hash out in real-time, and we can summarize the results of said discussion back in this review.

Yep, where do you hang out?

Sounds like the approach of renaming '_be_create' and 'be_deep_clone' is an acceptable solution.

You're summary/breakdown of the alternative approaches was helpful as well.

In D18564#399991, @rob.fx907_gmail.com wrote:

Do you use IRC or anything of the sort? Some of this might be easier to hash out in real-time, and we can summarize the results of said discussion back in this review.

Yep, where do you hang out?

I'm kevans91 on Freenode (#freebsd, mainly) as well as on EFNet (#bsdmips, #freebsd-wifi, and various other dev-type channels).

Sounds like the approach of renaming '_be_create' and 'be_deep_clone' is an acceptable solution.

You're summary/breakdown of the alternative approaches was helpful as well.

Hi,

Any progress/hangups on this? =)

Hey Kyle,

no hang-ups, been away from my development machine - back in town now.

I'm going to polish what I have and update this revision before the weekend is out.

Updating D18564: allow creation of non-recursive boot environments (libbe)

Added an API function ('be_create_shallow') for creating a non-recursive boot environment.

All be_create* functions are proxied through 'be_clone'. This function creates a boot environment.
'be_clone' accepts 4 function parameters:

  1. libbe context handle
  2. name of boot environment to be created
  3. the snapshot that the boot environment will be cloned from
  4. a boolean, create recursive clone of the snapshot? (i.e. a recursive boot environment)

Added a function named 'be_get_path' which constructs the dataset path for boot environment to be created.
For example, given a dataset 'zroot/ROOT/default/usr' and a boot environment named 'bootenv'. The result
from 'be_get_path' would be 'zroot/ROOT/bootenv/usr'.
'be_get_path' accepts 4 function parameters:

  1. struct libbe_deep_clone
  2. dataset path of the snapshot we will clone from
  3. character buffer for the result
  4. size of buffer

Renamed 'be_deep_clone' to 'be_clone_cb'.

Existing API functions functionality unchanged.

kevans added inline comments.
lib/libbe/be.c
448 ↗(On Diff #54216)

Return value should be parenthesized

460 ↗(On Diff #54216)

Return value should be parenthesized

587 ↗(On Diff #54216)

I would go ahead and kick err out and move be_clone into the return expression for these two functions, since we're not otherwise using the return value and it doesn't introduce an overly long line.

lib/libbe/be_impl.h
53 ↗(On Diff #54216)

Go ahead and make that a 'bool'

lib/libbe/tests/be_create.sh
6 ↗(On Diff #54216)

Unless you want to, i'm going to apply what @ngie did in rS344067 to this before committing.

Updating D18564: allow creation of non-recursive boot environments (libbe)

  • Import changes from https://reviews.freebsd.org/rS344067
  • Fixed type declaration (bool) for libbe_deep_clone->recursive. Removed 'err' from be_create_shallow and be_create_existing_from_snap. Added parens around the return expression in be_get_path.
rew marked 4 inline comments as done.Feb 24 2019, 10:40 AM

A question about the added files - am I supposed to include the copyright/license information or...?

In D18564#413473, @rob.fx907_gmail.com wrote:

A question about the added files - am I supposed to include the copyright/license information or...?

Ah, yes, please do so -- including the $FreeBSD$ comment in the Makefile (preferably followed by a blank line) and the copyright header for target_prog.c including the sys/cdefs.h #include and __FBSDID("$FreeBSD$");

Updating D18564: allow creation of non-recursive boot environments (libbe)

  • added license/copyright header information
  • added documentation for be_create_shallow
  • minor libbe(3) man changes for previous be_create* functions.

I'm not sure why the history is clobbered for this updated diff revision with regards to target_prog.c being copied from be_impl.h.

If you have a suggestion on how to fix or if it's not a big deal, let me know.

That's odd, but not a problem -- I'll do a final pass over this tomorrow and commit, but it looks good at a glance.

lib/libbe/tests/be_create.sh
79 ↗(On Diff #54713)

This should be a hard tab.

127 ↗(On Diff #54713)

-s exit:0 is the default.

I guess while you're addressing those, you can go ahead and upload a version that has the mtree addition with it, too. =-) ^/etc/mtree/BSD.tests.dist just needs the two-line addition to make test installation work out.

Sorry to arrive so late to this review.

In D18564#398209, @rob.fx907_gmail.com wrote:

It's probably poor style to change the user interface for bectl but, it's
already recursive by default - could always ignore the '-r' flag and add a flag
for shallow creation. Upside here is the current behavior remains unchanged.
Downside being, confusion for current users (i.e. the shell game of
command-line flags).

ZFS uses '-d #' to limit the depth of recursion for subcommands like 'list', with 0 being no recursion. This might fit well. It also would allow for a possibly nicer API:

be_create_depth(...existing args..., depth);

and be_create() just becomes a wrapper with depth=-1 (unlimited)

In an slightly unrelated naming note, ZFS often uses 'static foo_impl()' to be the 'guts' of foo() and foo_different()

A possible note for the future: If we are doing recursive cloning, rather than doing zfs_iter_filesystems() we might consider using ZFS Channel Programs (tiny lua scripts), so that all of the clones would be created as a single transaction. In this case, since they are created from a recursive snapshot, there isn't a concern about atomicity, but it is what channel programs were invented for.

By the way: These are just notes, I am not requesting that you change anything.

The channel programs look like a cool feature - I have a few questions about it:

  1. Where would these .zcp scripts live?
  2. Would they be consumed by libbe, bectl or ...?
  3. According to zfs-program(1M), cloning isn't available yet, any guess when it may be implemented?

Earlier in the review, Kyle suggested arbitrary depth creation as well. I'm not opposed to modifying this patch for depth creation if that's the preferred route. I think the use-cases for shallow boot environments is a short list - with arbitrary depth boot environments being an even shorter list. At some point the end-user will have to add/remove a dataset within the boot environment to get their 'perfect' set-up.

lib/libbe/tests/be_create.sh
127 ↗(On Diff #54713)

I see only line 127 is marked but, I presume that implies to remove all occurrences where the default '-s exit:0' is found?

Also, since '-o empty' is also the default, should these be removed as well?

rew marked 3 inline comments as done.

Updating D18564: allow creation of non-recursive boot environments (libbe)

  • removed default flags from atf tests.
  • mtree addition in BSD.tests.dist for libbe
In D18564#417318, @rob.fx907_gmail.com wrote:

The channel programs look like a cool feature - I have a few questions about it:

  1. Where would these .zcp scripts live?
  2. Would they be consumed by libbe, bectl or ...?
  3. According to zfs-program(1M), cloning isn't available yet, any guess when it may be implemented?

Earlier in the review, Kyle suggested arbitrary depth creation as well. I'm not opposed to modifying this patch for depth creation if that's the preferred route. I think the use-cases for shallow boot environments is a short list - with arbitrary depth boot environments being an even shorter list. At some point the end-user will have to add/remove a dataset within the boot environment to get their 'perfect' set-up.

Go ahead and add arbitrary depth to this libbe backend, with depth=-1 meaning 'unlimited'. I'd like to get this wrapped up and pushed out the door in time to MFC all of libbe/bectl before 11.3.

Sounds good, I will do that. You can expect to see an updated revision in the next few days.

Updating D18564: allow creation of non-recursive boot environments (libbe)

  • implement arbitrary depth creation of a boot environment
  • fix compiler warning for an implicit declaration on be_create_depth() when building 'target_prog'

Arbitrary depth creation of a boot environment can be done by using be_create_depth(). The libbe.3
documentation has been updated for be_create_depth(). be_create_depth() is a proxy to libbe's
be_clone(). An additional test has been added to test for single-depth creation of a boot environment.

In libbe/tests/Makefile, I added the compiler flag '-I${SRCTOP}/lib/libbe'. Without this flag, the
compiler will find '/usr/include/be.h', which could be out-of-sync and result in a compiler warning
for implicit declaration: be_create_depth(). This happens when building libbe+test before the newly
built libbe is installed to the system.

Do you think be_create_shallow() should be removed? It seems a bit redundant to keep it around since
the same thing can be accomplished by passing '0' to be_create_depth().

In D18564#429063, @rob.fx907_gmail.com wrote:

Updating D18564: allow creation of non-recursive boot environments (libbe)

  • implement arbitrary depth creation of a boot environment
  • fix compiler warning for an implicit declaration on be_create_depth() when building 'target_prog'

Arbitrary depth creation of a boot environment can be done by using be_create_depth(). The libbe.3
documentation has been updated for be_create_depth(). be_create_depth() is a proxy to libbe's
be_clone(). An additional test has been added to test for single-depth creation of a boot environment.

In libbe/tests/Makefile, I added the compiler flag '-I${SRCTOP}/lib/libbe'. Without this flag, the
compiler will find '/usr/include/be.h', which could be out-of-sync and result in a compiler warning
for implicit declaration: be_create_depth(). This happens when building libbe+test before the newly
built libbe is installed to the system.

Do you think be_create_shallow() should be removed? It seems a bit redundant to keep it around since
the same thing can be accomplished by passing '0' to be_create_depth().

Yeah, good point- let's go ahead and drop it.

Updating D18564: allow creation of non-recursive boot environments (libbe)

  • removed be_create_shallow()

I think this looks good now, thanks! A minor man page nit that I can fix up pre-commit: new sentences always start on a new line.

@allanjude Any other concerns?

This revision is now accepted and ready to land.Apr 20 2019, 2:34 AM
This revision was automatically updated to reflect the committed changes.