Page MenuHomeFreeBSD

unionfs: implement vnode-based cache lookup
ClosedPublic

Authored by jah on Oct 17 2021, 6:40 AM.
Tags
None
Referenced Files
Unknown Object (File)
Dec 13 2024, 1:59 PM
Unknown Object (File)
Dec 9 2024, 1:33 PM
Unknown Object (File)
Sep 10 2024, 10:42 PM
Unknown Object (File)
Sep 10 2024, 7:19 AM
Unknown Object (File)
Sep 4 2024, 2:15 PM
Unknown Object (File)
Aug 1 2024, 8:10 PM
Unknown Object (File)
Jul 29 2024, 3:15 PM
Unknown Object (File)
Jun 29 2024, 9:16 PM
Subscribers

Details

Summary

unionfs uses a per-directory hashtable to cache subdirectory nodes.
Currently this hashtable is looked up using the directory name, but
since unionfs nodes aren't removed from the cache until they're
reclaimed, this poses some problems. For example, if a directory is
created on a unionfs mount shortly after deleting a previous directory
with the same path, the cache may end up reusing the node for the
previous directory, including its upper/lower FS vnodes. Operations
against those vnodes with then likely fail because the vnodes
represent deleted files; for example UFS will reject VOP_MKDIR()
against such a vnode because its effective link count is 0. This may
then manifest as e.g. mkdir(2) or open(2) returning ENOENT for an
attempt to create a file under the re-created directory.

While it would be possible to fix this by explicitly managing the
name-based cache during delete or rename operations, or by rejecting
cache hits if the underlying FS vnodes don't match those passed to
unionfs_nodeget(), it seems cleaner to instead hash the unionfs nodes
based on their underlying FS vnodes. Since unionfs prefers to operate
against the upper vnode if one is present, the lower vnode will only
be used for hashing as long as the upper vnode is NULL. This should
also make hashing faster by eliminating string traversal and using
the already-computed hash index stored in each vnode.

While here, fix a couple of other cache-related issues:

--Remove 8 bytes of unnecessary baggage from each unionfs node by

getting rid of the stored hash mask field.  The mask is knowable
at compile time.

--When a matching node is found in the cache, reference its vnode

using vrefl() while still holding the vnode interlock.  Previously
unionfs_nodeget() would vref() the vnode after the interlock was
dropped, but the vnode may be reclaimed during that window.  This
caused intermittent panics from vn_lock(9) during unionfs stress
testing.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jah requested review of this revision.Oct 17 2021, 6:40 AM

I no longer see the message "mkdir.c:120: No such file or directory" with D32533.96980.diff applied.

sys/fs/unionfs/union_subr.c
157

Extra ()

186

Extra ()

196

It looks somewhat strange, unionfs_get_cached_vnode{,_locked} works on either lower or upper vnode. Don't we always know are we asking about lower or upper?

If yes, I propose to explicitly request either lower or upper.

sys/fs/unionfs/union_subr.c
196

I considered doing that when I wrote the patch, but IMO it makes the code unnecessarily complex. unionfs_get_cached_vnode() locked would need to take in a flag and use that to select comparison against un_uppervp vs. un_lowervp. A given vnode can't be both an upper and lower vnode for different unionfs vnodes on the same mount, so it should never be possible to get a false positive by checking the 'wrong' field.

sys/fs/unionfs/union_subr.c
816–821

Does this work when QUEUE_MACRO_DEBUG_TRASH is defined?

markj added inline comments.
sys/fs/unionfs/union_subr.c
225

Not to be addressed in this review specifically, but I think it'd be handy to have a helper

UNIONNODEVP(unp) ((unp)->un_uppervp != NULL ? (unp)->un_uppervp : (unp)->un_lowervp)

This conditional appears frequently, it seems.

816–821

I think so, unionfs_noderem() explicitly clears the linkage pointers. Maybe it'd be nicer to have a flag.

This revision is now accepted and ready to land.Oct 21 2021, 1:38 PM
sys/fs/unionfs/union_subr.c
816–821

Yes, it does work; I've been testing with QUEUE_MACRO_DEBUG_TRASH. unionfs_noderem() clears the current vnode's linkage (through unionfs_rem_cached_vnode()) and also clears out the vnode's hashtable along with the linkages of any cached subdirectory nodes. I believe this second part really exists to handle the forced unmount case in which vgonel() may be called on a vnode with a non-zero usecount, because otherwise the subdirectory references on the vnode would ensure that unionfs_noderem() isn't called until v_usecount reaches 0. If it weren't for that I think we could change these checks to asserts, because otherwise any in-use non-root unionfs VDIR node should be in the cache.

I do think the logic to clear the hashtable in unionfs_noderem() should be under the vnode interlock though.

kib added inline comments.
sys/fs/unionfs/union_subr.c
809

Should the upper vnode lock be taken before switching v_vnlock?

816–821

Yes, this should be a flag.

sys/fs/unionfs/union_subr.c
809

Yes, I think so. Otherwise we're effectively breaking the lock for a brief window.
I'll make this change and the other style/locking issues raised here in a series of follow-on commits.

This revision was automatically updated to reflect the committed changes.