Add online append feature to gconcat (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235473).
Details
- Reviewers
- None
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
There are some style bugs in the diff, but I'll hold off on commenting on them for the initial review. style(9) is a useful reference if you're not familiar with it.
Did you consider changing sc_lock into an sx lock instead of adding a second lock? Also, out of curiosity, what's your use case for this functionality?
lib/geom/concat/geom_concat.c | ||
---|---|---|
69 | These are sorted alphabetically, so let's move this to the beginning of the list. | |
sys/geom/concat/g_concat.c | ||
583 | I think sc_lock_append should come first in the lock order. g_concat_dumpconf() would violate that order, but you can fix that by requiring both the topology lock and sc_lock_append in order to modify the disk list. Are there any other places you're assuming that the topology lock comes first in the lock order? | |
1127 | I'd prefer to keep the disk structures on a linked list. gmirror does this, for example. The approach of realloc()ing the array and fixing up pointers is fragile, IMO. | |
1157 | The body of this loop should probably be its own subroutine. | |
1161 | Providers may have multiple names. For instance, /dev/ada2p1 and /dev/gptid/480b0835-ee38-11e7-81c5-001b785db800 refer to the same provider on my system. Hardcoded names ensure that the tasting process ignores these kinds of aliases. | |
sys/geom/concat/g_concat.h | ||
46 | This version number is for the metadata structure, which doesn't appear to have been updated by this diff. |
sys/geom/concat/g_concat.c | ||
---|---|---|
583 | I'm not sure how to accomplish this? AFAICS the topology lock is already acquired when these functions are called (acquired outside of this module even) so I don't see how I can reverse this order. Basically what happend is I implemented the thing and then WITNESS complained about this lock order reversal with topology. So suddenly there are lock order issues that I didn't anticipate at all but like this it seems to be correct - at least from my (very limited) viewpoint. |
You're right, this should be a linked list. The only reason I went with the reallocation approach is that I wanted to touch as little code as possible to avoid new bugs. What I have now is a compromise: it leaves the old logic (allocate all list elements at creation, some of them are just invalid) untouched but it's still a list so append doesn't have to reallocate and fix pointers. I chose a TAILQ because the code assumes that the list is ordered so append has to insert at the tail.
I don't think changing sc_lock makes sense since it (AFAICS) has the very specific purpose of protecting cross-part-boundary IOs (should rarely happen anyways) which is completely orthogonal to the append functionality.
I think everything in your first review should be addressed - except for the lock order of course, I still need help with that.
The backstory is rather simple, I don't want my server to go offline just because I'm growing the filesystem. Ever since Linux LVM showed me how easy partition management can be I hate working without this kind of flexibility and I was really happy to see that FreeBSD can replicate basically all of LVM's core features with the right combination of geoms - with just a few exceptions, online append being one of them.
I guess gvinum was at some point intended to solve all of this but after hitting two kernel panics during basic tests I gave up on using it. The bugs I filed still don't have a single response so I guess it's just old, unmaintained and dead.
sys/geom/concat/g_concat.c | ||
---|---|---|
251 | I think not returning here was a bug? |
Feel free to break the change into multiple patches. The change to use a TAILQ could certainly be committed on its own as a "no functional change" diff.
What I have now is a compromise: it leaves the old logic (allocate all list elements at creation, some of them are just invalid) untouched but it's still a list so append doesn't have to reallocate and fix pointers. I chose a TAILQ because the code assumes that the list is ordered so append has to insert at the tail.
I don't think changing sc_lock makes sense since it (AFAICS) has the very specific purpose of protecting cross-part-boundary IOs (should rarely happen anyways) which is completely orthogonal to the append functionality.
Ok. I need to review the existing locking in gconcat.
I think everything in your first review should be addressed - except for the lock order of course, I still need help with that.
The backstory is rather simple, I don't want my server to go offline just because I'm growing the filesystem. Ever since Linux LVM showed me how easy partition management can be I hate working without this kind of flexibility and I was really happy to see that FreeBSD can replicate basically all of LVM's core features with the right combination of geoms - with just a few exceptions, online append being one of them.
I guess gvinum was at some point intended to solve all of this but after hitting two kernel panics during basic tests I gave up on using it. The bugs I filed still don't have a single response so I guess it's just old, unmaintained and dead.
Yeah, gvinum is effectively dead AFAIK. I have not seen any work done on it in a long time.
sys/geom/concat/g_concat.c | ||
---|---|---|
251 | Yes, I'll go ahead and commit that now. Thanks. If you're interested in using gconcat as a kernel dump device, I found another bug some time ago :( https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=225120 | |
475 | You can get rid of this variable and just test disk == TAILQ_FIRST(&sc->sc_disks) above. |
sys/geom/concat/g_concat.c | ||
---|---|---|
541 | Style: the opening brace should be on the previous line. | |
547 | disk can be initialized in the for loop header. | |
553 | Style: the opening brace should be on the previous line. | |
583 | One issue is that the disk pointer may have been invalidated while the lock was dropped. That said, I don't think you need to look up disk until after this point? | |
664 | Might as well just delete this line. | |
718 | You can avoid the temp variable and write: while ((disk = TAILQ_FIRST(&sc->sc_disks)) != NULL) { TAILQ_REMOVE(&sc->sc_disks, disk, d_next); free(disk, M_CONCAT); } | |
1049 | Style: continuation lines should be indented by four spaces. | |
1180 | sc->sc_ndisks++; | |
sys/geom/concat/g_concat.h | ||
93 | Let's rename sc_lock to sc_completion_lock and sc_lock_append to sc_disks_lock. |
Feel free to break the change into multiple patches. The change to use a TAILQ could certainly be committed on its own as a "no functional change" diff.
Not sure how to do this? My current workflow is copy-pasting the output of svnlite diff into phabricator. I'm not aware of a convenient way to split up diffs like this.
sys/geom/concat/g_concat.c | ||
---|---|---|
583 | Why though? List elements are never removed so the pointer should remain valid? |
git makes it much easier to manage patch sets. Many developers use the git mirror at github.com/freebsd/freebsd. With svn it's hard to do anything other than create monolithic patches.
Anyway, I don't have any strong preferences one way or the other. My point is just that your changes provide a fairly major extension in functionality to gconcat, so I don't think you should attempt to minimize the diff to touch as few areas as possible.
sys/geom/concat/g_concat.c | ||
---|---|---|
583 | Fair enough, though that invariant is not actually enforced anywhere and could become false in the future. Let's just leave the locking here as it is in the patch for now. I am mostly worried about corner-case race conditions that typically arise when code drops a lock in order to avoid a lock order reversal. There has been a number of subtle bugs in gmirror that arose this way, so I tend to be suspicious of that pattern. |
What is the status of this? Do you need me to split the changes? Would I need to create an entirely separate phabricator diff then?
This moved to https://github.com/freebsd/freebsd-src/pull/447 and I've committed the locking change and the list change. I believe that the locking order is good, but if I've moved too quickly, please let me know.