remove support for LK_INTERLOCK from VOP_LOCK
AbandonedPublic
Actions

Authored by mjg on Sep 20 2022, 10:03 PM.

Details

Reviewers

kib
olce

Summary

commit 480ed6b1d254ef459f649a1a8422b81d8307ab5e
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Mon Apr 24 16:46:27 2023 +0000

    ufs: drop LK_SLEEPFAIL from ffs_sync
    
    It was added to handle identitity changes, but these no longer happen
    since vnodes became actually freeable.
    
    Even so, the change was bogus in that restarts could in principle keep
    happening.
    
    Note the issue is not fully fixed yet, but this gets rid of the only
    consumer of the flag to VOP_LOCK.

commit ac68c238252dbc76e1875e4ed9ecdb13e8241a3e
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Tue Sep 20 22:11:02 2022 +0000

    vfs: retire LK_INTERLOCK support from VOP_LOCK
    
    Passing the lock down buys nothing, but instead requires locking
    routines to be aware of its existence.
    
    It gets dropped immediately if the vnode lock is already held,
    meaning it does not protect the vnode from going away -- this is
    typically handled by having the caller vhold the vnode first.
    
    It does not prevent other threads from acquiring the vnode lock
    either as there are plenty of places which do it without acquiring
    the interlock first.

commit f4bd5f647b9dedd07e929d254d8838ad22456520
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Tue Sep 20 22:09:14 2022 +0000

    vfs: remove LK_INTERLOCK support from vget_finish
    
    Note it is kept for vget.

commit b4442f21867525e7262b20249d3be9bfbf56d16d
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Tue Sep 20 21:26:49 2022 +0000

    vfs: stop passing LK_INTERLOCK to VOP_LOCK
    
    Note this does not remove the support yet.

commit c18cd53852eb5f4886e5d0d5efefcd15f9ffc0fa
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Tue Sep 20 20:53:41 2022 +0000

    ufs: stop passing LK_INTERLOCK to vn_lock
    
    The feature is going away.

Test Plan

will ask pho

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

mjg requested review of this revision.Sep 20 2022, 10:03 PM

mjg created this revision.

What I expected to see in the summary, instead of the claim that the feature is going away, an explanation why the cases are still correct without using interlock.

mjg edited the summary of this revision. (Show Details)Sep 21 2022, 7:04 PM

mjg updated this revision to Diff 110846.Sep 22 2022, 10:56 AM

mjg retitled this revision from remove some LK_INTERLOCK uses to remove support for LK_INTERLOCK from VOP_LOCK.

mjg edited the summary of this revision. (Show Details)

mjg edited the test plan for this revision. (Show Details)

kib added inline comments.Sep 24 2022, 5:38 PM

sys/kern/vfs_default.c
528–529	It is rather pointless to keep the local after VI_MTX is no longer calculated
sys/kern/vfs_subr.c
3315–3316	VI_UNLOCK/VI_LOCK are not needed
3320–3321	From what I remember about LK_TRYUPGRADE, relocking interlock is not needed there as well.
3914	interlock is not needed at all

tidy ups
only allow interlock to be held for tryops

mjg marked 3 inline comments as done.Nov 5 2022, 1:15 AM

Consider LK_SLEEPFAIL | LK_INTERLOCK. If it returned success, then we can be sure that the VIRF_DOOMED was not set during the lock. I believe your patch breaks the guarantee.

It seems that the property is used at least by ffs snapshots. I was sure that it is used by SU as well, but it seems that SU uses this pattern only on buffers.

I don't remember if I mentioned, but the goal is to make the interface more similar to regular locks, so that something like sx can be employed as a vnode lock. Consequently LK_SLEEPFAIL also needs to go.

You are right about the change as is, but I think this is fixable.

The only use I found is in ffs_sync, where I'm confident it is bogus to begin with.

According the commit which introduced it:

commit 41d4783d494514df11aaeea38deba16ea3848606
Author: Jeff Roberson <jeff@FreeBSD.org>
Date:   Sun Apr 3 10:38:18 2005 +0000

     - In ffs_sync we need to pass LK_SLEEPFAIL in when we lock the vnode
       because it may change identities while we're sleeping on the lock.
       Otherwise we may bail out of ffs_sync() early due to an error from
       deadfs.

But if the vnode transitions to VBAD, there is nothing to do and the main loop can continue. Instead, the code restarts iterating from scratch. Similarly, if the code is allowed to sleep waiting on the lock, why restart squat if the vnode did not get doomed? This is a huge problem as a sufficiently unlucky locking pattern can turn this into an indefinite loop in principle. So how about this in concept (untested):

diff --git a/sys/ufs/ffs/ffs_vfsops.c b/sys/ufs/ffs/ffs_vfsops.c
index 8f6e186b44b2..36c49e51e432 100644
--- a/sys/ufs/ffs/ffs_vfsops.c
+++ b/sys/ufs/ffs/ffs_vfsops.c
@@ -1713,7 +1713,7 @@ ffs_sync(struct mount *mp, int waitfor)
        }
        if (waitfor == MNT_WAIT)
                lockreq = LK_EXCLUSIVE;
-       lockreq |= LK_INTERLOCK | LK_SLEEPFAIL;
+       lockreq |= LK_INTERLOCK | LK_RETRY;
 loop:
        /* Grab snapshot of secondary write counts */
        MNT_ILOCK(mp);
@@ -1742,11 +1742,9 @@ ffs_sync(struct mount *mp, int waitfor)
                        VI_UNLOCK(vp);
                        continue;
                }
-               if ((error = vget(vp, lockreq)) != 0) {
-                       if (error == ENOENT || error == ENOLCK) {
-                               MNT_VNODE_FOREACH_ALL_ABORT(mp, mvp);
-                               goto loop;
-                       }
+               vget(vp, lockreq);
+               if (VN_IS_DOOMED(vp))
+                       vput(vp);
                        continue;
                }
 #ifdef QUOTA

whack LK_SLEEPFAIL

Herald added a subscriber: imp. · View Herald TranscriptApr 24 2023, 4:50 PM

I whacked LK_SLEEPFAIL. As of this patchset, it is no longer legal to LK_INTERLOCK or LK_SLPEEFAIL to VOP_LOCK.

I don't think this regresses anything that I can see.

Your DOOMED state concern is already problem in the unpatched state, for example:

struct vnode *
__mnt_vnode_first_all(struct vnode **mvp, struct mount *mp)
{
        struct vnode *vp;

        *mvp = vn_alloc_marker(mp);
        MNT_ILOCK(mp);
        MNT_REF(mp);

        TAILQ_FOREACH(vp, &mp->mnt_nvnodelist, v_nmntvnodes) {
                /* Allow a racy peek at VIRF_DOOMED to save a lock acquisition. */
                if (vp->v_type == VMARKER || VN_IS_DOOMED(vp))
                        continue;
                VI_LOCK(vp);
                if (VN_IS_DOOMED(vp)) {
                        VI_UNLOCK(vp);
                        continue;
                }
                break;
        }

as in this skips the doomed vnodes on its own without any waits for state to settle.

I'm going to fix it, but not in time for 14.0

So why is the interlock support kept for vget()? I do not see it different from e.g. just doing the same VI_UNLOCK()/clearing LK_INTERLOCK in vn_lock() as you do in vget(), instead of the pass over all tree.

sys/fs/unionfs/union_vnops.c
1993	Why you removed MTX_DUPOK?
sys/kern/vfs_subr.c
5669
sys/ufs/ffs/ffs_vfsops.c
1743	Why the interlock was kept there?
1743	I forget to remove this comment when I wrote a more generic form in the global discussion.
sys/ufs/ffs/ffs_vnops.c
531