Page MenuHomeFreeBSD

Make root mount wait smarter.
ClosedPublic

Authored by trasz on Sep 22 2015, 12:39 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 9, 4:38 PM
Unknown Object (File)
Sun, Jan 5, 3:49 PM
Unknown Object (File)
Fri, Jan 3, 5:54 PM
Unknown Object (File)
Dec 4 2024, 11:23 AM
Unknown Object (File)
Nov 27 2024, 12:56 AM
Unknown Object (File)
Nov 25 2024, 8:06 PM
Unknown Object (File)
Nov 25 2024, 4:26 PM
Unknown Object (File)
Nov 23 2024, 4:17 AM
Subscribers

Details

Summary

Make root mount wait mechanism smarter, by making it wait only if it's actually
neccessary to find the root device.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 580
Build 580: arc lint + arc unit

Event Timeline

trasz retitled this revision from to Make root mount wait smarter..
trasz updated this object.
trasz edited the test plan for this revision. (Show Details)
sys/kern/vfs_mountroot.c
687–688

This looks like it's more for debugging than for an end user. Maybe put this under boot verbose? The reason is that it's not an error until it's an error, meaning that if we're going to wait for the device when it's not there yet, then printing errors is confusing. Putting it under boot verbose is a nice way for people to get insight into why a boot may be stalled...

Remove debugging printf.

trasz added inline comments.
sys/kern/vfs_mountroot.c
689–690

Actually, it was a debugging printf that snuck in. Removed.

sys/kern/vfs_mountroot.c
947

Please come with much shorter name for the function.

967

I think this is not needed, and probably even dangerous, for the nfs case. I have no better suggestion than ugly strstr("nfs") hack, but it is still much better than preventing nfs mount due to the issues with the local storage subsystem.

987

This blank line is not needed.

trasz added inline comments.
sys/kern/vfs_mountroot.c
967

For the non-NFS case it's definitely needed, because without it, partitions don't get created before we check for the device presence, and we end up doing the full wait in all cases, defying the purpose of this patch.

For the NFS case, this code is not reached - just like for ZFS, we call vfs_mountroot_wait() and return few lines earlier.

marcel edited edge metadata.
This revision is now accepted and ready to land.Sep 22 2015, 2:57 PM
kib edited edge metadata.
trasz requested a review of this revision.Oct 1 2015, 12:27 PM
trasz added a reviewer: mav.

Alexander, just to be extra safe, do you perhaps see a scenario - like with graid - where this could break stuff?

Is this code used for all boot mounting, or only for root FS? If only root, then I worry about case when root is single disk, available immediately, while there is some other FS on graid, which incompleteness forces delay for 30 seconds before device will be registered. Won't other FS mounting fail in this case?

That's a good question. Yeah, I think things might break in this case. Do you have any idea how to fix it?

One idea would be to put this vfs_mountroot_wait_if_neccessary() into vfs_donmount(). It should work, but it's ugly. Another would be to add a "root mount hold counter" and export it via sysctl, and then in the mountcritlocal script do the "mount -a", as it's done now, and if it fails, wait for the sysctl to go to zero and retry.

I know very little about all this mounting code, so hardly I am an authority here. But I think that "root mount hold" should be waited in any case, the only question is how late can we postpone it. Some applications may access raw devices without mounting, and for them it would be better if device detection process was really completed before their start. I think it should be safe to proceed with root (or other critical FS) mounting if we have required devices, but at some point before starting random daemons we still may prefer to stop and wait for the rest of holds.

Instead of hold counter it could be better to export list of holds, as it is printed now on console. It is not always informative, but at least something.

trasz edited edge metadata.

Optionally wait for root release in mountcritlocal.

Hm, I like the list idea. I'll try to rework the patch.

I would not rework the patch. While mav@'s comments are good, they go beyond the simple problem that the hold solved. The hold only solved the holding off of mounting root while interrupt-driven discovery was still ongoing. You patch, trasz@, avoids that we wait mounting root when there was no point waiting, as the device was already there. That is a good change.

User space could never assume that devices are there when root gets mounted, though I'm sure this assumption has snuck into our own code that gets run from rc.d. The user space problem isn't solved by making lists of devices that can hold the root mount -- that's pretty much any block device. Such a list does not solve the fundamental problem that's hot plug.

GEOM in particular is very nasty to work with when it comes to creating partitions and suddenly having all kinds of gems appear due to pre-existing data.

What I'm thinking off is an interface that allows use space to wait for the kernel to become silent with respect to block device discovery and/or GEOM tasting. This interface can be used in the various mount scripts under /etc/rc.d as barriers.

Anyway: I don't think reworking the patch is a good idea, irrespective what else we may want to do.

FYI I have been testing this patch with success in our testlab

Make the sysctl return an actual list.

Marcel, I agree a better interface is needed, eventually. However, this change without the mountcritlocal part can break people's setups, and that's something I'd like to avoid.

That said, I think it would be better to commit it in two parts, first the already approved fix, and then the new sysctl and rc scripts.

This revision was automatically updated to reflect the committed changes.