Switch mount.conf(8) support to use functions in md(4).
Needs ReviewPublic
Actions

Authored by brooks on Mar 17 2018, 12:19 AM.

Details

Reviewers

cem
kib
markj
imp
jhb

Summary

As md(4) may be a loadable module, use function pointers for access to
kern_mdattach and kern_mddetach. In general, this pattern poses a small
risk of races with unload, but that should not be a concern in the
mountroot code path.

Remove support for calling MDIOCATTACH from the kernel.

Diff Detail

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 15830
Build 15838: arc lint + arc unit

Event Timeline

brooks created this revision.Mar 17 2018, 12:19 AM

Harbormaster completed remote builds in B15607: Diff 40367.Mar 17 2018, 12:19 AM

I didn't look at parse_dir_md() yet.

Don't we already have refcounted module APIs for accessing things that could be unloaded? dev_refthread() + dsw->foo + dev_relthread() come to mind. Are those not usable here? It might not require using the ioctl() API — there are spare pointers in cdevsw that could be overloaded or something. Or is devfs off limits because we're still trying to mount root here?

You can pass the kind of request as an additional parameter, or extend md_req with the request type. Then you only need to expose one pointer.

sys/kern/vfs_mountroot.c
585–591	FYI: This conflicts with D14712

This revision is now accepted and ready to land.Mar 17 2018, 12:22 PM

markj accepted this revision.Mar 17 2018, 4:42 PM

markj added inline comments.

sys/dev/md/md.c
81	Would it be worth making a _uio.h instead? There's a number of headers in sys/sys that include sys/uio.h.

Add jhb to see if he has an opinion to the interface between the module and the kernel.

Rebase
Use sys/_uio.h to avoid the need for sys/uio.h before sys/mdioctl.h

This revision now requires review to proceed.Mar 27 2018, 5:31 PM

Harbormaster completed remote builds in B15830: Diff 40807.Mar 27 2018, 5:31 PM

Bare function pointers is a bit hackish. A couple of thoughts come to mind:

Add a global sx lock and just grab it around invocations of the function pointers (and check if the function pointer is NULL while holding the lock) and also grab it in the module load/unload routines around setting/clearing the function pointers. You probably want to reject unloading if there are active memory disks and should hold the lock while doing that check.

A variant of the above would be to use EVENTHANDLERS instead of bare function pointers. However, that doesn't close the race of checking that there aren't any active memory disks in the case that one is being added concurrent with the check.

Hmm, so it seems you can't do the check for active memory disks in unload as that is all buried in GEOM itself. One would be tempted to use the GEOM topology lock for the lock to protect the function pointers, but I suspect that will give you a LOR or recursion complaints. These issues don't exist if this stays an ioctl().

One issue with ioctls is we have no way to determine if an ioctl is invoked from the kernel or from userland. There are some cases where it would be nice to tell that apart for sure (e.g. by setting a free bit in the ioctl that userland isn't permitted to set, only there aren't any available bits).

*sigh* md(4) already has races with unload as mdinit() doen't check to see if it's been unloaded before invoking g_new_geomf() after grabbing the topology lock. You would need g_md_fini() to lock md_sx and set some sort of dying flag that kern_mdattach_locked checks first thing and bails if it is set. You would then need to not destroy the md_sx lock until after destroy_dev() of status_cdev has drained any threads out of the ioctl. Of course, this means dropping the topology lock in the fini callback to avoid LORs. If you made the function pointers EVENTHANDLERS and you deregistered them after setting 'dying' but before destroying md_sx then I think you would close the races.

sys/dev/md/md.c
2086	You need to clear these in the 'fini' routine presumably?

In D14715#312779, @jhb wrote:

One issue with ioctls is we have no way to determine if an ioctl is invoked from the kernel or from userland. There are some cases where it would be nice to tell that apart for sure (e.g. by setting a free bit in the ioctl that userland isn't permitted to set, only there aren't any available bits).

One idea did occur to me on this subject. It might be worth considering making ioctl commands be uint64_t's internally. That would let us add a kernel flag and handle kernel versions a separate commands. The result might be too disruptive though.