Page MenuHomeFreeBSD

Rework v_object lifecycle for vnodes.
ClosedPublic

Authored by kib on Aug 25 2019, 1:47 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Mar 18, 11:51 AM
Unknown Object (File)
Feb 17 2024, 10:10 PM
Unknown Object (File)
Jan 30 2024, 3:58 PM
Unknown Object (File)
Jan 30 2024, 1:04 PM
Unknown Object (File)
Dec 20 2023, 5:53 AM
Unknown Object (File)
Dec 13 2023, 7:13 AM
Unknown Object (File)
Dec 10 2023, 10:39 AM
Unknown Object (File)
Nov 30 2023, 12:54 AM

Details

Summary

Current implementation of vnode_create_vobject()/vnode_destroy_vobject() is written so that it allows object destruction for live vnode. Practically, no filesystems use this, except for some remnants present in UFS. One of the consequences of that model is that each filesystem must call vnode_destroy_vobject() in VOP_RECLAIM() or earlier, as result all of them get rid of the v_object in reclaim.

Move the call to vnode_destroy_vobject() to vgonel() before VOP_RECLAIM(). This makes v_object stable: either the object is NULL, or it is valid vm object till the vnode reclamation. Remove code from vnode_create_vobject() to handle races with the parallel destruction.

For UFS, in ffs_valloc(), force reclaim existing vnode on inode reuse, instead of trying to re-initialize the same vnode for new purposes.

[UFS change is really a blocker and must be handled before the rest, I am posting this for discussion and testing]

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 26107

Event Timeline

sys/ufs/ffs/ffs_alloc.c
1142

"force allocation of a new vnode..."

1148

Don't you need to unlock *vpp?

sys/vm/vnode_pager.c
186

Should we assert obj->type == OBJT_VNODE here?

vput ufs vnode after vgone.
add assert.

sys/ufs/ffs/ffs_alloc.c
1152

It looks to me like you could instead define FFSV_REPLACE to perform the vgone() in ffs_vgetf(), but it is just a suggestion.

sys/vm/vnode_pager.c
157

I believe the OBJ_DISCONNECTWNT flag is now unused.

235–236

Delete the "MPSAFE" while here?

271

Can you explain this race? Does it arise from parallel open()s? I do not see why we want to destroy the object and retry in this case.

kib marked 7 inline comments as done.Aug 26 2019, 3:42 PM
kib added inline comments.
sys/vm/vnode_pager.c
271

The vnode_pager_alloc() function may be called with vnode only share-locked. This can happen e.g. for UFS read-only opens indeed, or for ZFS any opens, or for UFS directories when lookup is performed.

Since these operations can occur in parallel, we might allocate two vnode objects for the vnode. In this case, allocators synchronize on the vnode interlock to assign to vp->v_object, and one thread wins. I do not see what else the loosing thread can do except to free now unneeded object. goto retry is just to ensure that winner succesfully set v_object, and we can bump ref count on it as normal flow.

kib marked an inline comment as done.

Rework FFS bits. Other minor tweaks.

sys/ufs/ffs/ffs_vfsops.c
1673

We should probably assert (flags & LK_EXCLUSIVE) != 0 if FFSV_REPLACE is set.

1697

Hmm, so isn't it possible for two threads to race to allocate the same inode, and the losing thread calls vgone() on the newly created vnode?

sys/vm/vnode_pager.c
271

I see, for some reason I read the code as destroying the pre-existing vp->v_object.

kib marked 3 inline comments as done.Aug 26 2019, 4:48 PM
kib added inline comments.
sys/ufs/ffs/ffs_vfsops.c
1697

It is only possible for fully allocated inode, I believe. If inode is not yet allocated, only one thread can successfully allocate it, and then no other thread should instantiating the vnode. In other words, the described race is vald, but not for FFSV_REPLACE.

kib marked an inline comment as done.

Assert that FFSV_REPLACE requests exclusive vnode lock.

markj added inline comments.
sys/ufs/ffs/ffs_vfsops.c
1697

Is it valid to add the following assertion below?

error = vfs_hash_insert(...);
if (error != 0)
    return (error);
if (*vpp != NULL) {
    MPASS((ffs_flags & FFSV_REPLACE) == 0);
    return (0);
}
This revision is now accepted and ready to land.Aug 26 2019, 4:56 PM
kib marked an inline comment as done.Aug 26 2019, 5:35 PM
kib added inline comments.
sys/ufs/ffs/ffs_vfsops.c
1697

I think yes, it should be correct. I added it with some comment explaining the assumption about freshly allocated inode.

kib marked an inline comment as done.

Add assertions (LK_EXCLUSIVE now for real).

This revision now requires review to proceed.Aug 26 2019, 5:35 PM
This revision is now accepted and ready to land.Aug 26 2019, 6:54 PM

Peter, could you, please, start testing this ?

In D21412#466484, @kib wrote:

Peter, could you, please, start testing this ?

Sure.

mckusick added a subscriber: mckusick.

This change looks good to me modulo minor inline comment.

This revision now requires changes to proceed.Aug 27 2019, 6:45 PM

My inline comment seems to have been lost. I requested that ufs_prepare_reclaim() be made static.

sys/ufs/ufs/ufs_inode.c
183

The ufs_prepare_reclaim() function should be made static.

kib marked an inline comment as done.Aug 28 2019, 10:16 AM
kib added inline comments.
sys/ufs/ufs/ufs_inode.c
183

I removed the function at all. It no longer makes sense since it is single-use.

kib marked an inline comment as done.

Do not destroy v_object for vnodes which bypass.
Inline ufs_prepare_reclaim().

Removing the function is right approach. All looks good.

This revision is now accepted and ready to land.Aug 28 2019, 8:00 PM

I ran a full stress2 test on i386 with D21412.61303.diff.
I ran tests on amd64 with D21412.61396.diff for two hours.
No problems seen.

This revision was automatically updated to reflect the committed changes.