sys/fs/nullfs/null_vfsops.c
(68 lines not shown)
 static vfs_extattrctl_t	nullfs_extattrctl;

 /*
  * Mount null layer
  */
 static int
 nullfs_mount(struct mount *mp)
 {
+	struct mount *lowermp;
 	struct vnode *lowerrootvp;
 	struct vnode *nullm_rootvp;
 	struct null_mount *xmp;
 	struct null_node *nn;
 	struct nameidata nd, *ndp;
 	char *target;
 	int error, len;
 	bool isvnunlocked;
(70 lines not shown)
 		if (nn == NULL || lowerrootvp == nn->null_lowervp) {
 			return (EDEADLK);
 		}
 	}
 	xmp = (struct null_mount *) malloc(sizeof(struct null_mount),
 	    M_NULLFSMNT, M_WAITOK | M_ZERO);

 	/*
-	 * Save pointer to underlying FS and the reference to the
-	 * lower root vnode.
+	 * Save the reference to the lower root vnode.
 	 */
-	xmp->nullm_vfs = lowerrootvp->v_mount;
 	vref(lowerrootvp);
+	lowermp = nullfs_mount_trybusy(lowerrootvp);
kib: I do not think this is useful, because the lowerrootvp vnode is locked, so unmount cannot proceed past vflush(). In fact, if MBF_NOWAIT were not used in nullfs_mount_trybusy(), this would be a LoR. But I am not sure that reporting transient failures from MBF_NOWAIT is the right choice. The idea with non-forced unmount is that it should be invisible to userspace. E.g. automount does periodic probes of its filesystems with non-forced unmount, and userspace should not see transient errors due to probing.

jah: Ok. For the other VFS operations besides mount, would it be preferable to drop MBF_NOWAIT and allow vfs_busy() to block?

jah: Specifically, the concern I had regarding deadlock would be if there could ever be a case in which the lower mount would be busied on the call here, then an umount would happen on another thread, followed by our attempt to busy the mount. So basically the same thing that led to unbusying requirements around quota file I/O in VFS_QUOTACTL(). The same concern would apply both here and in the corresponding unionfs change. But for both nullfs and unionfs, would it instead be better to simply prevent even forced unmount of any lower FS while the upper is mounted? mnt_uppers wouldn't work as-is for unionfs, and it seems to imply some extra functionality anyway, but perhaps something like a new mnt_kern_flag?

jah: (Or rather, a mnt_upper_count or mnt_hold_count field in struct mount, since a mount can have multiple upper mounts.)

kib: So why not register nullfs upper mounts unconditionally? Of course, the VFS_NOTIFY_UPPER_XXX callbacks would do nothing. That said, I do not think that any busying is needed for nullfs_mount, because the liveness of the lowervp provides the strongest guarantee.

jah: I agree, busying shouldn't be needed for nullfs_mount(). I'm more concerned with the other vfs_* entrypoints, and also the corresponding entrypoints in unionfs. I'm hesitant to use mnt_uppers to keep the lower mount from going away for two reasons: …

kib: I added the check for TAILQ_EMPTY(&mp->mnt_uppers) from the moment the list was introduced. AFAIR, the intent was to prevent panics that plagued the portbuild cluster when people unmounted the lower FS forgetting about nullfs mounts. I believe it is simpler and a better fit for the nullfs code. For unionfs, if this approach is not adequate, let's develop something else. Busy is tricky because …

jah: Why not use my earlier suggestion to add a counter to struct mount? The changes here and in D30152 would be replaced with a much simpler set of changes that would: … This would work for both nullfs and unionfs, and wouldn't cause any unnecessary VFS_NOTIFY_UPPER_* callbacks in the nocache case for nullfs. What it wouldn't do is allow unionfs to participate in the VFS_NOTIFY_UPPER_* scheme, but I think unionfs has bigger issues to be fixed before worrying about that.

kib: But would the increment of mnt_pinned_count be blocked if unmount is already started? If yes, it is yet another busy count; if not, what prevents pinned from going from 0 to 1 after unmount drained it? Intuitively, we have a lot of draining counters already: busy is drained before unmount starts, mnt_ref after unmount but before struct mount can be reused, and write_count is drained by some VFS_UNMOUNT implementations after busy but before they start to tramp down the unwritten changes. So, again intuitively, it seems excessive to add yet another counter, since it would duplicate some existing one. [Let's put unionfs aside for a moment.] BTW, for the purpose of returning EBUSY on unmount, we do not need to take the mount interlock. We also check mnt_uppers unlocked first.

jah: I would not want the increment to block. I would want it to fail if unmount is already started, that is if (lowermp->mnt_kern_flag & MNTK_UNMOUNT) != 0, also checked within the same ilock section as the increment of mnt_pinned_count. Since the increment only happens on mount of the upper filesystem, the mount attempt shouldn't be allowed if there is already a pending unmount request, even if that request doesn't ultimately succeed. In dounmount(), mnt_uppers is checked while holding the ilock, just before setting MNTK_UNMOUNT. dounmount_cleanup() drops the ilock if we decide to return EBUSY.

kib: Ok, let's go with this direction.
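The direction agreed above can be modeled outside the kernel. The sketch below is a hypothetical userland analogue, not the kernel API: the names model_pin, model_unpin, and model_try_unmount, the struct, and its fields are all illustrative. It shows the invariant jah describes: the pin count and the unmount flag are examined and updated under the same lock (standing in for the mount interlock), so a pin can never be taken once an unmount request has been accepted, and an unmount request fails with EBUSY while pins are outstanding, meaning the count can never go 0 -> 1 after unmount drained it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland model of a mount with a pin count. */
struct model_mount {
	pthread_mutex_t	mtx;		/* models the mount interlock */
	int		pinned_count;	/* models the proposed mnt_pinned_count */
	bool		unmounting;	/* models MNTK_UNMOUNT */
};

#define	MODEL_MOUNT_INIT	{ PTHREAD_MUTEX_INITIALIZER, 0, false }

/* Taken when an upper FS mounts over 'mp'; fails once unmount has started. */
static int
model_pin(struct model_mount *mp)
{
	int error = 0;

	pthread_mutex_lock(&mp->mtx);
	if (mp->unmounting)
		error = EBUSY;
	else
		mp->pinned_count++;
	pthread_mutex_unlock(&mp->mtx);
	return (error);
}

/* Dropped when the upper FS unmounts. */
static void
model_unpin(struct model_mount *mp)
{
	pthread_mutex_lock(&mp->mtx);
	assert(mp->pinned_count > 0);
	mp->pinned_count--;
	pthread_mutex_unlock(&mp->mtx);
}

/* Models dounmount(): refuses while pinned, else commits to unmounting. */
static int
model_try_unmount(struct model_mount *mp)
{
	int error = 0;

	pthread_mutex_lock(&mp->mtx);
	if (mp->pinned_count > 0)
		error = EBUSY;
	else
		mp->unmounting = true;
	pthread_mutex_unlock(&mp->mtx);
	return (error);
}
```

Because both the flag check and the counter update happen in one critical section, there is no window in which an unmount drains the count and a new pin sneaks in afterwards, which is the race kib asks about.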
 	xmp->nullm_lowerrootvp = lowerrootvp;
 	mp->mnt_data = xmp;

 	/*
 	 * Make sure the node alias worked.
 	 */
+	if (lowermp == NULL)
+		error = ENOENT;
+	else
 	error = null_nodeget(mp, lowerrootvp, &nullm_rootvp);
 	if (error != 0) {
+		nullfs_mount_unbusy(lowermp);
 		vrele(lowerrootvp);
 		free(xmp, M_NULLFSMNT);
 		return (error);
 	}

 	if (NULLVPTOLOWERVP(nullm_rootvp)->v_mount->mnt_flag & MNT_LOCAL) {
 		MNT_ILOCK(mp);
 		mp->mnt_flag |= MNT_LOCAL;
 		MNT_IUNLOCK(mp);
 	}

 	xmp->nullm_flags |= NULLM_CACHE;
 	if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0 ||
-	    (xmp->nullm_vfs->mnt_kern_flag & MNTK_NULL_NOCACHE) != 0)
+	    (lowermp->mnt_kern_flag & MNTK_NULL_NOCACHE) != 0)
 		xmp->nullm_flags &= ~NULLM_CACHE;

 	MNT_ILOCK(mp);
 	if ((xmp->nullm_flags & NULLM_CACHE) != 0) {
 		mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag &
 		    (MNTK_SHARED_WRITES | MNTK_LOOKUP_SHARED |
 		    MNTK_EXTENDED_SHARED);
 	}
 	mp->mnt_kern_flag |= MNTK_LOOKUP_EXCL_DOTDOT | MNTK_NOMSYNC;
 	mp->mnt_kern_flag |= lowerrootvp->v_mount->mnt_kern_flag &
 	    (MNTK_USES_BCACHE | MNTK_NO_IOPF | MNTK_UNMAPPED_BUFS);
 	MNT_IUNLOCK(mp);
 	vfs_getnewfsid(mp);
 	if ((xmp->nullm_flags & NULLM_CACHE) != 0) {
-		MNT_ILOCK(xmp->nullm_vfs);
-		TAILQ_INSERT_TAIL(&xmp->nullm_vfs->mnt_uppers, mp,
-		    mnt_upper_link);
-		MNT_IUNLOCK(xmp->nullm_vfs);
+		MNT_ILOCK(lowermp);
+		TAILQ_INSERT_TAIL(&lowermp->mnt_uppers, mp,
+		    mnt_upper_link);
+		MNT_IUNLOCK(lowermp);
 	}
+	nullfs_mount_unbusy(lowermp);

 	vfs_mountedfrom(mp, target);
 	vput(nullm_rootvp);

 	NULLFSDEBUG("nullfs_mount: lower %s, alias at %s\n",
 	    mp->mnt_stat.f_mntfromname, mp->mnt_stat.f_mntonname);
 	return (0);
 }
(30 lines not shown)
 	for (;;) {
 		if ((mntflags & MNT_FORCE) == 0)
 			return (EBUSY);
 	}

 	/*
 	 * Finally, throw away the null_mount structure
 	 */
 	mntdata = mp->mnt_data;
-	ump = mntdata->nullm_vfs;
 	if ((mntdata->nullm_flags & NULLM_CACHE) != 0) {
+		ump = mntdata->nullm_lowerrootvp->v_mount;
+		/*
+		 * Registration as upper should prevent even forced
+		 * unmount of lower FS.
+		 */
+		KASSERT(ump != NULL, ("nullfs: lower mount gone"));
 		MNT_ILOCK(ump);
 		while ((ump->mnt_kern_flag & MNTK_VGONE_UPPER) != 0) {
 			ump->mnt_kern_flag |= MNTK_VGONE_WAITER;
 			msleep(&ump->mnt_uppers, &ump->mnt_mtx, 0, "vgnupw", 0);
 		}
 		TAILQ_REMOVE(&ump->mnt_uppers, mp, mnt_upper_link);
 		MNT_IUNLOCK(ump);
 	}
(23 lines not shown)
 	if (error == 0) {
 		if (error == 0) {
 			*vpp = vp;
 		}
 	}
 	return (error);
 }
 static int
-nullfs_quotactl(mp, cmd, uid, arg)
+nullfs_quotactl(mp, cmd, uid, arg, mp_busy)
 	struct mount *mp;
 	int cmd;
 	uid_t uid;
 	void *arg;
+	bool *mp_busy;
 {
-	return VFS_QUOTACTL(MOUNTTONULLMOUNT(mp)->nullm_vfs, cmd, uid, arg);
+	struct mount *lowermp;
+	struct null_mount *mntdata;
+	int error;
+	bool unbusy;
+
+	unbusy = true;
+	mntdata = MOUNTTONULLMOUNT(mp);
+	NULLFSDEBUG("nullfs_quotactl(mp = %p, vp = %p)\n", mp,
+	    mntdata->nullm_lowerrootvp);
+	lowermp = nullfs_mount_trybusy(mntdata->nullm_lowerrootvp);
+	if (lowermp == NULL)
+		return (ENOENT);
+	error = VFS_QUOTACTL(lowermp, cmd, uid, arg, &unbusy);
+	if (unbusy)
+		nullfs_mount_unbusy(lowermp);
+	return (error);
 }
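The rewritten entrypoints below (statfs, vget, fhtovp, extattrctl) all share the shape introduced in nullfs_quotactl() above: busy the lower mount, fail the whole operation with ENOENT if it is already gone, forward the call, and unbusy on the way out. As a minimal userland sketch of that shape, with hypothetical names standing in for nullfs_mount_trybusy()/nullfs_mount_unbusy() and for the forwarded VFS operation:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the lower mount and its busy reference. */
struct model_lower {
	bool	gone;		/* lower FS already unmounted */
	int	busy_count;	/* models the vfs_busy() reference */
};

/* Models nullfs_mount_trybusy(): NULL means the lower mount is gone. */
static struct model_lower *
model_trybusy(struct model_lower *lower)
{
	if (lower == NULL || lower->gone)
		return (NULL);
	lower->busy_count++;
	return (lower);
}

/* Models nullfs_mount_unbusy(); tolerates NULL like the failure path. */
static void
model_unbusy(struct model_lower *lower)
{
	if (lower != NULL)
		lower->busy_count--;
}

/* Stand-in for the forwarded op, e.g. VFS_STATFS() on the lower FS. */
static int
model_lower_op(struct model_lower *lower)
{
	(void)lower;
	return (0);
}

/* The wrapper shape: busy, forward, unbusy. */
static int
model_upper_op(struct model_lower *lowerhint)
{
	struct model_lower *lowermp;
	int error;

	lowermp = model_trybusy(lowerhint);
	if (lowermp == NULL)
		return (ENOENT);
	error = model_lower_op(lowermp);
	model_unbusy(lowermp);
	return (error);
}
```

The point of the shape is that the busy reference is held exactly for the duration of the forwarded call, so the lower mount cannot be reclaimed mid-operation, while a concurrent unmount that has already completed surfaces as ENOENT rather than a use-after-free.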
 static int
 nullfs_statfs(mp, sbp)
 	struct mount *mp;
 	struct statfs *sbp;
 {
+	struct mount *lowermp;
+	struct null_mount *mntdata;
 	int error;
 	struct statfs *mstat;

-	NULLFSDEBUG("nullfs_statfs(mp = %p, vp = %p->%p)\n", (void *)mp,
-	    (void *)MOUNTTONULLMOUNT(mp)->nullm_rootvp,
-	    (void *)NULLVPTOLOWERVP(MOUNTTONULLMOUNT(mp)->nullm_rootvp));
+	mntdata = MOUNTTONULLMOUNT(mp);
+	NULLFSDEBUG("nullfs_statfs(mp = %p, vp = %p->%p)\n", mp,
+	    mntdata->nullm_rootvp, NULLVPTOLOWERVP(mntdata->nullm_rootvp));
 	mstat = malloc(sizeof(struct statfs), M_STATFS, M_WAITOK | M_ZERO);
-	error = VFS_STATFS(MOUNTTONULLMOUNT(mp)->nullm_vfs, mstat);
+	lowermp = nullfs_mount_trybusy(mntdata->nullm_lowerrootvp);
+	if (lowermp == NULL)
+		error = ENOENT;
+	else
+		error = VFS_STATFS(lowermp, mstat);
+	nullfs_mount_unbusy(lowermp);
 	if (error) {
 		free(mstat, M_STATFS);
 		return (error);
 	}
 	/* now copy across the "interesting" information and fake the rest */
 	sbp->f_type = mstat->f_type;
 	sbp->f_flags = (sbp->f_flags & (MNT_RDONLY | MNT_NOEXEC | MNT_NOSUID |
(24 lines not shown)
 static int
 nullfs_vget(mp, ino, flags, vpp)
 	struct mount *mp;
 	ino_t ino;
 	int flags;
 	struct vnode **vpp;
 {
+	struct mount *lowermp;
+	struct null_mount *mntdata;
 	int error;

+	mntdata = MOUNTTONULLMOUNT(mp);
 	KASSERT((flags & LK_TYPE_MASK) != 0,
 	    ("nullfs_vget: no lock requested"));
-	error = VFS_VGET(MOUNTTONULLMOUNT(mp)->nullm_vfs, ino, flags, vpp);
+	lowermp = nullfs_mount_trybusy(mntdata->nullm_lowerrootvp);
kib: Again, is it needed? The caller of VFS_VGET() must ensure that mp is live for the duration of the call. Then, since nullfs is registered in mnt_uppers of lowermp, it should prevent lowermp's unmount from even starting.

jah: Registration in mnt_uppers won't happen in the nocache case.
+	if (lowermp == NULL)
+		return (ENOENT);
+	error = VFS_VGET(lowermp, ino, flags, vpp);
+	nullfs_mount_unbusy(lowermp);
 	if (error != 0)
 		return (error);

 	return (null_nodeget(mp, *vpp, vpp));
 }
 static int
 nullfs_fhtovp(mp, fidp, flags, vpp)
 	struct mount *mp;
 	struct fid *fidp;
 	int flags;
 	struct vnode **vpp;
 {
+	struct mount *lowermp;
+	struct null_mount *mntdata;
 	int error;

-	error = VFS_FHTOVP(MOUNTTONULLMOUNT(mp)->nullm_vfs, fidp, flags,
-	    vpp);
+	mntdata = MOUNTTONULLMOUNT(mp);
+	lowermp = nullfs_mount_trybusy(mntdata->nullm_lowerrootvp);
+	if (lowermp == NULL)
+		return (ENOENT);
+	error = VFS_FHTOVP(lowermp, fidp, flags, vpp);
+	nullfs_mount_unbusy(lowermp);
 	if (error != 0)
 		return (error);

 	return (null_nodeget(mp, *vpp, vpp));
 }
 static int
 nullfs_extattrctl(mp, cmd, filename_vp, namespace, attrname)
 	struct mount *mp;
 	int cmd;
 	struct vnode *filename_vp;
 	int namespace;
 	const char *attrname;
 {
+	struct mount *lowermp;
+	struct null_mount *mntdata;
+	int error;

-	return (VFS_EXTATTRCTL(MOUNTTONULLMOUNT(mp)->nullm_vfs, cmd,
-	    filename_vp, namespace, attrname));
+	mntdata = MOUNTTONULLMOUNT(mp);
+	lowermp = nullfs_mount_trybusy(mntdata->nullm_lowerrootvp);
+	if (lowermp == NULL)
+		return (ENOENT);
+	error = VFS_EXTATTRCTL(lowermp, cmd, filename_vp, namespace,
+	    attrname);
+	nullfs_mount_unbusy(lowermp);
+	return (error);
 }
 static void
 nullfs_reclaim_lowervp(struct mount *mp, struct vnode *lowervp)
 {
 	struct vnode *vp;

 	vp = null_hashget(mp, lowervp);
(64 lines not shown)