commit 480ed6b1d254ef459f649a1a8422b81d8307ab5e Author: Mateusz Guzik <mjg@FreeBSD.org> Date: Mon Apr 24 16:46:27 2023 +0000 ufs: drop LK_SLEEPFAIL from ffs_sync It was added to handle identitity changes, but these no longer happen since vnodes became actually freeable. Even so, the change was bogus in that restarts could in principle keep happening. Note the issue is not fully fixed yet, but this gets rid of the only consumer of the flag to VOP_LOCK. commit ac68c238252dbc76e1875e4ed9ecdb13e8241a3e Author: Mateusz Guzik <mjg@FreeBSD.org> Date: Tue Sep 20 22:11:02 2022 +0000 vfs: retire LK_INTERLOCK support from VOP_LOCK Passing the lock down buys nothing, but instead requires locking routines to be aware of its existence. It gets dropped immediately if the vnode lock is already held, meaning it does not protect the vnode from going away -- this is typically handled by having the caller vhold the vnode first. It does not prevent other threads from acquiring the vnode lock either as there are plenty of places which do it without acquiring the interlock first. commit f4bd5f647b9dedd07e929d254d8838ad22456520 Author: Mateusz Guzik <mjg@FreeBSD.org> Date: Tue Sep 20 22:09:14 2022 +0000 vfs: remove LK_INTERLOCK support from vget_finish Note it is kept for vget. commit b4442f21867525e7262b20249d3be9bfbf56d16d Author: Mateusz Guzik <mjg@FreeBSD.org> Date: Tue Sep 20 21:26:49 2022 +0000 vfs: stop passing LK_INTERLOCK to VOP_LOCK Note this does not remove the support yet. commit c18cd53852eb5f4886e5d0d5efefcd15f9ffc0fa Author: Mateusz Guzik <mjg@FreeBSD.org> Date: Tue Sep 20 20:53:41 2022 +0000 ufs: stop passing LK_INTERLOCK to vn_lock The feature is going away.
Details
will ask pho
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
What I expected to see in the summary, instead of the claim that the feature is going away, an explanation why the cases are still correct without using interlock.
sys/kern/vfs_default.c | ||
---|---|---|
528–529 | It is rather pointless to keep the local after VI_MTX is no longer calculated | |
sys/kern/vfs_subr.c | ||
3315–3316 | VI_UNLOCK/VI_LOCK are not needed | |
3320–3321 | From what I remember about LK_TRYUPGRADE, relocking interlock is not needed there as well. | |
3914 | interlock is not needed at all |
Consider LK_SLEEPFAIL | LK_INTERLOCK. If it returned success, then we can be sure that the VIRF_DOOMED was not set during the lock. I believe your patch breaks the guarantee.
It seems that the property is used at least by ffs snapshots. I was sure that it is used by SU as well, but it seems that SU uses this pattern only on buffers.
I don't remember if I mentioned, but the goal is to make the interface more similar to regular locks, so that something like sx can be employed as a vnode lock. Consequently LK_SLEEPFAIL also needs to go.
You are right about the change as is, but I think this is fixable.
The only use I found is in ffs_sync, where I'm confident it is bogus to begin with.
According the commit which introduced it:
commit 41d4783d494514df11aaeea38deba16ea3848606 Author: Jeff Roberson <jeff@FreeBSD.org> Date: Sun Apr 3 10:38:18 2005 +0000 - In ffs_sync we need to pass LK_SLEEPFAIL in when we lock the vnode because it may change identities while we're sleeping on the lock. Otherwise we may bail out of ffs_sync() early due to an error from deadfs.
But if the vnode transitions to VBAD, there is nothing to do and the main loop can continue. Instead, the code restarts iterating from scratch. Similarly, if the code is allowed to sleep waiting on the lock, why restart squat if the vnode did not get doomed? This is a huge problem as a sufficiently unlucky locking pattern can turn this into an indefinite loop in principle. So how about this in concept (untested):
diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c index 8f6e186b44b2..36c49e51e432 100644 --- a/sys/ufs/ffs/ffs_vfsops.c +++ b/sys/ufs/ffs/ffs_vfsops.c @@ -1713,7 +1713,7 @@ ffs_sync(struct mount *mp, int waitfor) } if (waitfor == MNT_WAIT) lockreq = LK_EXCLUSIVE; - lockreq |= LK_INTERLOCK | LK_SLEEPFAIL; + lockreq |= LK_INTERLOCK | LK_RETRY; loop: /* Grab snapshot of secondary write counts */ MNT_ILOCK(mp); @@ -1742,11 +1742,9 @@ ffs_sync(struct mount *mp, int waitfor) VI_UNLOCK(vp); continue; } - if ((error = vget(vp, lockreq)) != 0) { - if (error == ENOENT || error == ENOLCK) { - MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp); - goto loop; - } + vget(vp, lockreq); + if (VN_IS_DOOMED(vp)) + vput(vp); continue; } #ifdef QUOTA
I whacked LK_SLEEPFAIL. As of this patchset, it is no longer legal to LK_INTERLOCK or LK_SLPEEFAIL to VOP_LOCK.
I don't think this regresses anything that I can see.
Your DOOMED state concern is already problem in the unpatched state, for example:
struct vnode * __mnt_vnode_first_all(struct vnode **mvp, struct mount *mp) { struct vnode *vp; *mvp = vn_alloc_marker(mp); MNT_ILOCK(mp); MNT_REF(mp); TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) { /* Allow a racy peek at VIRF_DOOMED to save a lock acquisition. */ if (vp->v_type == VMARKER || VN_IS_DOOMED(vp)) continue; VI_LOCK(vp); if (VN_IS_DOOMED(vp)) { VI_UNLOCK(vp); continue; } break; }
as in this skips the doomed vnodes on its own without any waits for state to settle.
I'm going to fix it, but not in time for 14.0
So why is the interlock support kept for vget()? I do not see it different from e.g. just doing the same VI_UNLOCK()/clearing LK_INTERLOCK in vn_lock() as you do in vget(), instead of the pass over all tree.
sys/fs/unionfs/union_vnops.c | ||
---|---|---|
1993 | Why you removed MTX_DUPOK? | |
sys/kern/vfs_subr.c | ||
5669 | ||
sys/ufs/ffs/ffs_vfsops.c | ||
1743 | Why the interlock was kept there? | |
1743 | I forget to remove this comment when I wrote a more generic form in the global discussion. | |
sys/ufs/ffs/ffs_vnops.c | ||
531 |