Page MenuHomeFreeBSD

geom: add feature: gconcat online append
Needs ReviewPublic

Authored by noah.bergbauer_tum.de on Feb 17 2019, 7:49 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Add online append feature to gconcat (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235473).

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

There are some style bugs in the diff, but I'll hold off on commenting on them for the initial review. style(9) is a useful reference if you're not familiar with it.

Did you consider changing sc_lock into an sx lock instead of adding a second lock? Also, out of curiosity, what's your use case for this functionality?

lib/geom/concat/geom_concat.c
68

These are sorted alphabetically, so let's move this to the beginning of the list.

sys/geom/concat/g_concat.c
578

I think sc_lock_append should come first in the lock order. g_concat_dumpconf() would violate that order, but you can fix that by requiring both the topology lock and sc_lock_append in order to modify the disk list. Are there any other places you're assuming that the topology lock comes first in the lock order?

1121

I'd prefer to keep the disk structures on a linked list. gmirror does this, for example. The approach of realloc()ing the array and fixing up pointers is fragile, IMO.

1151

The body of this loop should probably be its own subroutine.

1155

Providers may have multiple names. For instance, /dev/ada2p1 and /dev/gptid/480b0835-ee38-11e7-81c5-001b785db800 refer to the same provider on my system. Hardcoded names ensure that the tasting process ignores these kinds of aliases.

sys/geom/concat/g_concat.h
46

This version number is for the metadata structure, which doesn't appear to have been updated by this diff.

sys/geom/concat/g_concat.c
578

I'm not sure how to accomplish this? AFAICS the topology lock is already acquired when these functions are called (acquired outside of this module even) so I don't see how I can reverse this order.

Basically what happend is I implemented the thing and then WITNESS complained about this lock order reversal with topology. So suddenly there are lock order issues that I didn't anticipate at all but like this it seems to be correct - at least from my (very limited) viewpoint.

You're right, this should be a linked list. The only reason I went with the reallocation approach is that I wanted to touch as little code as possible to avoid new bugs. What I have now is a compromise: it leaves the old logic (allocate all list elements at creation, some of them are just invalid) untouched but it's still a list so append doesn't have to reallocate and fix pointers. I chose a TAILQ because the code assumes that the list is ordered so append has to insert at the tail.

I don't think changing sc_lock makes sense since it (AFAICS) has the very specific purpose of protecting cross-part-boundary IOs (should rarely happen anyways) which is completely orthogonal to the append functionality.

I think everything in your first review should be addressed - except for the lock order of course, I still need help with that.

The backstory is rather simple, I don't want my server to go offline just because I'm growing the filesystem. Ever since Linux LVM showed me how easy partition management can be I hate working without this kind of flexibility and I was really happy to see that FreeBSD can replicate basically all of LVM's core features with the right combination of geoms - with just a few exceptions, online append being one of them.
I guess gvinum was at some point intended to solve all of this but after hitting two kernel panics during basic tests I gave up on using it. The bugs I filed still don't have a single response so I guess it's just old, unmaintained and dead.

noah.bergbauer_tum.de added inline comments.
sys/geom/concat/g_concat.c
253

I think not returning here was a bug?

You're right, this should be a linked list. The only reason I went with the reallocation approach is that I wanted to touch as little code as possible to avoid new bugs.

Feel free to break the change into multiple patches. The change to use a TAILQ could certainly be committed on its own as a "no functional change" diff.

What I have now is a compromise: it leaves the old logic (allocate all list elements at creation, some of them are just invalid) untouched but it's still a list so append doesn't have to reallocate and fix pointers. I chose a TAILQ because the code assumes that the list is ordered so append has to insert at the tail.

I don't think changing sc_lock makes sense since it (AFAICS) has the very specific purpose of protecting cross-part-boundary IOs (should rarely happen anyways) which is completely orthogonal to the append functionality.

Ok. I need to review the existing locking in gconcat.

I think everything in your first review should be addressed - except for the lock order of course, I still need help with that.

The backstory is rather simple, I don't want my server to go offline just because I'm growing the filesystem. Ever since Linux LVM showed me how easy partition management can be I hate working without this kind of flexibility and I was really happy to see that FreeBSD can replicate basically all of LVM's core features with the right combination of geoms - with just a few exceptions, online append being one of them.
I guess gvinum was at some point intended to solve all of this but after hitting two kernel panics during basic tests I gave up on using it. The bugs I filed still don't have a single response so I guess it's just old, unmaintained and dead.

Yeah, gvinum is effectively dead AFAIK. I have not seen any work done on it in a long time.

sys/geom/concat/g_concat.c
253

Yes, I'll go ahead and commit that now. Thanks.

If you're interested in using gconcat as a kernel dump device, I found another bug some time ago :( https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225120

474

You can get rid of this variable and just test disk == TAILQ_FIRST(&sc->sc_disks) above.

sys/geom/concat/g_concat.c
539

Style: the opening brace should be on the previous line.

544

disk can be initialized in the for loop header.

549

Style: the opening brace should be on the previous line.

578

One issue is that the disk pointer may have been invalidated while the lock was dropped. That said, I don't think you need to look up disk until after this point?

659

Might as well just delete this line.

712–713

You can avoid the temp variable and write:

while ((disk = TAILQ_FIRST(&sc->sc_disks)) != NULL) {
    TAILQ_REMOVE(&sc->sc_disks, disk, d_next);
    free(disk, M_CONCAT);
}
1043

Style: continuation lines should be indented by four spaces.

1174

sc->sc_ndisks++;

sys/geom/concat/g_concat.h
91

Let's rename sc_lock to sc_completion_lock and sc_lock_append to sc_disks_lock.

Feel free to break the change into multiple patches. The change to use a TAILQ could certainly be committed on its own as a "no functional change" diff.

Not sure how to do this? My current workflow is copy-pasting the output of svnlite diff into phabricator. I'm not aware of a convenient way to split up diffs like this.

sys/geom/concat/g_concat.c
578

Why though? List elements are never removed so the pointer should remain valid?

Feel free to break the change into multiple patches. The change to use a TAILQ could certainly be committed on its own as a "no functional change" diff.

Not sure how to do this? My current workflow is copy-pasting the output of svnlite diff into phabricator. I'm not aware of a convenient way to split up diffs like this.

git makes it much easier to manage patch sets. Many developers use the git mirror at github.com/freebsd/freebsd. With svn it's hard to do anything other than create monolithic patches.

Anyway, I don't have any strong preferences one way or the other. My point is just that your changes provide a fairly major extension in functionality to gconcat, so I don't think you should attempt to minimize the diff to touch as few areas as possible.

sys/geom/concat/g_concat.c
578

Fair enough, though that invariant is not actually enforced anywhere and could become false in the future.

Let's just leave the locking here as it is in the patch for now. I am mostly worried about corner-case race conditions that typically arise when code drops a lock in order to avoid a lock order reversal. There has been a number of subtle bugs in gmirror that arose this way, so I tend to be suspicious of that pattern.

What is the status of this? Do you need me to split the changes? Would I need to create an entirely separate phabricator diff then?