
unionfs: rework locking scheme to only lock a single vnode
Needs Review · Public

Authored by jah on Wed, May 29, 4:51 AM.
Tags
None

Details

Reviewers
kib
markj
olce
pho
Summary

Instead of locking both the lower and upper vnodes, which is both
complex and deadlock-prone, only lock the upper vnode, or the lower
vnode if no upper vnode is present.

In most cases this is all that is needed; for the cases in which
both vnodes do need to be locked, this change also employs deadlock-
avoiding techniques such as LK_NOWAIT and vn_lock_pair().

There are still some corner cases in which the current implementation
ends up taking multiple vnode locks across different filesystems
without taking special steps to avoid deadlock; those cases have
been noted in the comments.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 58215
Build 55103: arc lint + arc unit

Event Timeline

jah requested review of this revision. Wed, May 29, 4:51 AM

These changes have proven stable in the testing I've done so far, but I still see this as a somewhat rough patch.
I expect there will be plenty of review feedback, but my hope is that what I have here can at least be used as a starting point to fix unionfs locking.

Some things to note:
--There are quite a few places in which various ideas put forth in Olivier's design doc (per-mount caching, use of vn_generic_copy_file_range(), just to name two) would clearly improve things.
--Where feasible, I've taken extra steps to use deadlock-avoiding tactics such as vn_lock_pair(), LK_NOWAIT, and exchanging locks whenever the need arises to hold multiple vnode locks in different filesystems. I've also added comments in places where this isn't so feasible to do. However, it's also quite possible that some of these extra steps are unnecessarily paranoid; with things such as the udvp->lvp transition for example, it's unclear to me under what circumstances an LOR deadlock would become a practical concern. Therefore I may end up being able to re-simplify some of this code.
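To illustrate, the general shape of the LK_NOWAIT backoff referred to above is roughly the following. This is an illustrative sketch only; the helper name and structure are hypothetical and not code from the patch, which uses vn_lock_pair() where possible.

#include <sys/param.h>
#include <sys/systm.h>
#include <sys/lock.h>
#include <sys/mutex.h>
#include <sys/vnode.h>

/*
 * Illustrative sketch only, not code from the patch: acquire a second
 * vnode's lock while already holding 'held', without ever sleeping on
 * one lock while the other is held.  Callers must still revalidate
 * both vnodes afterwards, since 'held' may have been doomed while it
 * was temporarily dropped.
 */
static void
lock_second_vnode(struct vnode *held, struct vnode *wanted)
{
        ASSERT_VOP_ELOCKED(held, __func__);
        for (;;) {
                /* Opportunistic attempt; never blocks while 'held' is held. */
                if (vn_lock(wanted, LK_EXCLUSIVE | LK_NOWAIT) == 0)
                        return;
                /* Back off completely and take the contended lock first. */
                VOP_UNLOCK(held);
                vn_lock(wanted, LK_EXCLUSIVE | LK_RETRY);
                if (vn_lock(held, LK_EXCLUSIVE | LK_NOWAIT) == 0)
                        return;
                /* Still contended the other way around; restart. */
                VOP_UNLOCK(wanted);
                vn_lock(held, LK_EXCLUSIVE | LK_RETRY);
        }
}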

Sorry, I've lagged on this one. I'll review it on Monday.

Thanks for this important cleanup on the road to the new unionfs!

The changes are quite extensive and I couldn't review all of them; I'm going to continue tomorrow. In the meantime, I'm posting my current notes.

Comments are mostly inline.

Here's a more general comment prompted by the code comment at lines 130-138. It doesn't seem realistic to me that unionfs can generally rely on something like a cache of IDs. That would imply that all filesystem types used as lower layers can actually provide unique IDs in all circumstances, which I don't think is or will be true, especially considering unionfs itself as a lower filesystem. Moreover, as we grow support for concurrent lower-layer modifications, caching IDs from other filesystems will lead to very hard consistency/lifecycle problems.

The best approach, and perhaps the only sensible one, I have seen so far is that unionfs has to rely completely on the lower-layer filesystems for all operations, and for this it must get rid of any cache it maintains of data on them. Roughly, for lookups we have to always rely on names and their resolution by the underlying filesystems, and when we need persistent handles on underlying files, we have to keep references on the corresponding vnodes.

The copy-up operation is indeed serializing, in that it cannot be started multiple times without the risk of corrupting the result, so there must be some flag indicating the operation is in progress (which can be seen as a transient cache of operations initiated by unionfs, by contrast with the other caches mentioned above; it's not enough for fully concurrent operations, but it's a start). For shadow directories, I don't think we really need that, as we can instead rely on the underlying filesystems performing the directory creation and atomically reporting whether the directory already exists (or fix them if they don't), but it doesn't hurt.

sys/fs/unionfs/union_subr.c
279–284

A minor point, but I'd rather have only two separate cases depending on whether un_uppervp is NULL, using vput() in the lower-only case to avoid unlocking un_lowervp and then calling vrele() on it (which may relock it); see the sketch below.
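Roughly, the two-case structure would be (a sketch only, using the node's existing un_uppervp/un_lowervp fields, not the actual patch code):

/* Sketch of the suggested structure, not the actual patch code. */
if (unp->un_uppervp != NULL) {
        vput(unp->un_uppervp);          /* unlocks and releases in one step */
        if (unp->un_lowervp != NULL)
                vrele(unp->un_lowervp); /* not locked in this case */
} else {
        /*
         * Lower-only node: the lower vnode holds the lock, so vput() it
         * instead of VOP_UNLOCK() followed by vrele(), which may relock.
         */
        vput(unp->un_lowervp);
}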

392–414

I agree with most of your reasoning above.

As unionfs currently always creates shadow directories, we know that when dvp is locked the corresponding upper-layer directory's vnode is locked as well, so locking uppervp always respects the "parent, then child" order.

And it is true that cross-FS locking can happen in other circumstances. The only thing is, I'm not sure why you think this case is particularly difficult to handle here. Given the previous paragraph, only the lowervp case (when uppervp is NULL) is of concern, and I think the simple suggested change attached to this comment is enough. I'm just asking in case there is something I missed, but not requesting any actual change.

418–432

You're actually changing local behavior here. If lowervp is doomed, and uppervp is not NULL, lowervp is kept in the new unionfs node, whereas it was forcibly set to NULL before. Also, if uppervp is doomed, we now bail out, whereas before we would continue with lowervp if not NULL.

Both changes make sense and were in fact in my plans. The first one leads me to wonder whether keeping the doomed lowervp, instead of setting it to NULL, can have any unwanted effect given the original code and your changes here. The answer may come as I progress through this review.

sys/fs/unionfs/union_vnops.c
130–138

I'd suggest removing the end of the comment, as I don't think this is where we are going to head. Please see my general comment as to why.

I ran a longer test with this patch and found this:

20240612 02:19:29 all (18/18): unionfs9.sh
ps -lp55088
UID   PID  PPID C PRI NI   VSZ  RSS MWCHAN  STAT TT     TIME COMMAND
  0 55088 37241 6  68  0 13088 2324 unioncp I+    0  0:00.01 find /mnt12 -type f -maxdepth 2 -ls
You have new mail.
root@mercat1:~ # procstat -k 55088
  PID    TID COMM                TDNAME              KSTACK                       
55088 100476 find                -                   mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep unionfs_set_in_progress_flag unionfs_lookup VOP_CACHEDLOOKUP_APV vfs_cache_lookup VOP_LOOKUP_APV vfs_lookup namei kern_statat sys_fstatat amd64_syscall fast_syscall_common 
root@mercat1:~ #

https://people.freebsd.org/~pho/stress/log/log0526.txt


Thanks for testing; this looks like a dumb mistake in unionfs_clear_in_progress_flag(). I'll fix that along with the next round of changes after I gather more review feedback.

sys/fs/unionfs/union_subr.c
279–284

Good catch; I've already made the same change elsewhere in this PR specifically to avoid relocking in vrele(), but it looks like I missed this spot.

392–414

I think the comment makes the fix sound worse than it would be. Probably the biggest issue is that, as the code is currently structured, the same fix would also need to be repeated below for the case in which unionfs_ins_cached_vnode() returns an existing vnode and that vnode has to be locked.

Of course that's also yet another argument for just getting rid of that cache; I'd just prefer not to do that here, since this change is already more than big enough. I guess I could just make the simple change above, and then note the remaining locking issue below along with something like "TODO: get rid of this stupid cache".

418–432

I would expect that to be OK. A doomed lowervp would imply forced unmount of the lower FS, but that can only happen if the unionfs is first recursively unmounted. Since we effectively hold the lock on the new unionfs vnode at this point, I'd therefore expect either the insmntque1() call above to fail due to the in-progress unmount, or (if the recursive unmount fully executes after insmntque1()) the new vnode to be doomed by the vflush() shortly after we unlock it.

sys/fs/unionfs/union_vnops.c
130–138

Will change. I have no idea what the final unionfs will look like, so this comment will almost certainly be proven wrong no matter what.

sys/fs/unionfs/union_subr.c
870

wakeup() should not be conditional on unp != NULL
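In sleep/wakeup terms, the concern is roughly the following (a sketch with a hypothetical flag name and wakeup channel, not the patch code):

unp = VTOUNIONFS(vp);
if (unp != NULL)
        unp->un_flag &= ~UNIONFS_IN_PROGRESS;   /* hypothetical flag name */
/*
 * A waiter may be sleeping in unionfs_set_in_progress_flag() even if
 * the node has already been reclaimed, so the wakeup must not sit
 * inside the unp != NULL check above.
 */
wakeup(vp);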

Well, I'm almost done, but not completely (I need to finish reviewing unionfs_lock() and the lines below it). Here's a new round of comments/notes/requests.

Concerning unionfs_mkshadowdir(), I've added a series of suggested minor changes. These are mostly to clarify and simplify things (and, I suspect, marginally improve concurrency). You may as well forget about them if you think this is too much work for now (because, e.g., they would cause a new long round of testing).

The main problems I've spotted so far are some missing unlocks in unionfs_lookup() and the VV_ROOT case in unionfs_noderem(), which is probably not handled correctly.

And, to correct something I've said:

The best approach, and perhaps the only sensible one, I have seen so far is that unionfs has to completely rely on lower layer filesystems for all operations, and for this must get rid of any cache it maintains from data on them.

Well, that's true, but only for information that can be modified by the underlying filesystems, which is what I had in mind when writing that. However, the identity of the vnodes themselves is not included in that, so a cache based on vnode identities (and v_hash) is in fact perfectly feasible, and will indeed make it possible to avoid the duplicate vnode problem in lookup and elsewhere. I'll also have to check the other foreseen uses developed in the full proposal.

sys/fs/unionfs/union_subr.c
392–414

I'm not sure there is any need for a fix in unionfs_ins_cached_vnode(). Because shadow directories are systematically created at lookup, the current thread cannot have the containing directory's lower vnode locked (otherwise the cross-filesystem lock rule, i.e., not establishing any order, would already have been violated higher up in the call stack).

(I also have to go back to my project to globally simplify locking of multiple vnodes in the VFS and double-check whether it can really work.)

Removing the cache is primarily for semantics and lifecycle reasons, but indeed will have a huge simplification effect on several parts of unionfs. However, I don't think it makes much sense to do so without at the same time revising the semantics. I'll do both in the first phase of the plan.

418–432

Ah yes, dooming of the lower vnode would happen only on recursive unmount. This is also a paradigm that I will change. But that's true for the time being.

I'd like to check what could happen nonetheless after the lower vnode is doomed, because as you say, even if vflush() is eventually called, we don't really know what "shortly" means, and I'd rather not rely on that.

488–491

A small comment explaining why lvp needs to be locked would be great. What about the VV_ROOT case?

827–829

Minor simplification.

870

Yes, that's a problem, and likely the cause of @pho's deadlock.

938

(Minor.)

Label rename.

944–946

(Minor.)

udvp (with alias dvp) is locked before vput(), unlocked by it, and then relocked in unionfs_mkshadowdir_exit2. I'd suggest replacing all this dance with the minimal set of locking operations.

Label rename.

966–968

(Minor.)

Same as above.

971–972

(Minor.)

Same as above.

993–999

(Minor.)

Add a call to vn_finished_write() here, as I've removed it from the exit labels below. Label rename.

1013–1014

(Minor.)

Call vn_finished_write() earlier. This also compensates for its removal under the labels at the end, a consequence of avoiding superfluous unlock/lock cycles.

1014–1033

(Minor.)

Avoid some superfluous lock dance (see also inline comments above). Label renames.

1031

(Minor.)

Label rename.

1582–1584

I wouldn't say that VOP_OPEN() is not truly needed. The current state in the VFS is sadly completely ad hoc, some filesystems expecting it and others not, with varying degrees of consequences. I'm not sure at this point whether we will require VOP_OPEN()/VOP_CLOSE() around VOP_READDIR() or not at all; I only know we'll have to make a consistent choice.

1658

unionfs_check_rmdir() should return 0 if the directory is determined to be empty, or ENOTEMPTY if it is not, but only if the determination itself didn't hit any error. Any error from these calls should lead to bailing out immediately with that error, and the callers should be adapted to cope.

This is not a new bug. The old code was quite close, however: it would in the end convert any error to ENOTEMPTY, whereas the new code steps farther away from that.
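The requested contract, roughly (a sketch with a hypothetical probe helper, not the unionfs code itself):

static int
check_rmdir_sketch(struct vnode *vp, struct ucred *cred, struct thread *td)
{
        bool empty;
        int error;

        /* directory_is_empty() is a hypothetical stand-in for the probing calls. */
        error = directory_is_empty(vp, cred, td, &empty);
        if (error != 0)
                return (error);         /* do not fold real errors into ENOTEMPTY */
        return (empty ? 0 : ENOTEMPTY);
}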

sys/fs/unionfs/union_vnops.c
223–225

lvp must be unlocked here.

263–265

Same as above for uvp.

627–629

Please keep this warning, as I don't think this case can occur (and should be fixed if it does).

837–838

(Too bad C is not Lispy enough...)

1729–1734

VOP_RENAME() seems to be a typo here. Did you mean VOP_READDIR()?

2243–2248

VV_ROOT case? (see unionfs_noderem())

Quoting the earlier correction:

However, the identity of the vnodes themselves is not included in that, so a cache based on vnode identities (and v_hash) is in fact perfectly feasible, and will indeed make it possible to avoid the duplicate vnode problem in lookup and elsewhere.

Use of the vnode pointer as the identifier was indeed what I had in mind when I originally wrote the comment.

sys/fs/unionfs/union_subr.c
392–414

Technically, we don't proactively create shadow directories during lookup if the unionfs instance is mounted read-only (however unlikely it may be for a user to do that). But I'd also like to go back to the question I asked at the top of this review: how likely is it for this specific udvp->lvp locking sequence to present a practical risk of deadlock? In other words, how likely is it that some other agent will want to lock those two specific vnodes on different filesystems in the opposite order?

418–432

Actually I don't think this is a case we should encounter today. At this point, if the insmntque1() call hasn't already failed then the unionfs unmount operation's vflush() call should block on the lock we currently hold for the newly-allocated vnode. This should then prevent dooming of the underlying vnodes.

Perhaps these VN_IS_DOOMED() checks should instead be converted to VNASSERTs?

488–491

This is really to simplify unionfs_lock() (per the comment there) by synchronizing against any pending lock operation that may still be using the lower vnode lock. That should never be the case for the root vnode, as the root vnode should always have both an upper and lower vnode for its entire lifecycle, so unionfs_lock() should never attempt to use the lower root vnode. Moreover, if we are unmounting a non-"below" unionfs mount, this vn_lock_pair() call will assert because lvp will already be locked (as it is the covered vnode).

I'll add a comment to this effect.

944–946

These are error handling cases, so I wouldn't be very concerned about the extra locking cycles. But the other issue is that dvp may be reclaimed due to relocking of udvp in various places (unionfs_relookup, VOP_MKDIR, VOP_SETATTR). At that point, in locking terms udvp will no longer be an alias for dvp, so dvp must be explicitly relocked anyway.

I do prefer your label naming and placement of vn_finished_write() though, so I'll change those.

sys/fs/unionfs/union_vnops.c
223–225

oops, good catch

263–265

good catch, should also branch to unionfs_lookup_cleanup.

627–629

This warning actually does occur, and in fact it's extremely spammy during unionfs stress2 testing. IIRC this is because vgonel() calls VOP_CLOSE() on any active vnode, regardless of whether anything had it open.

1729–1734

Indeed, that's what I meant.

sys/fs/unionfs/union_vnops.c
627–629

Looking back over unionfs_get_node_status(), perhaps a better change would be to keep the warning but allow the close() path to use a NULL status object if one doesn't already exist, rather than allocating a new one through unionfs_get_node_status(). This would certainly be more efficient.
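Something along these lines, perhaps (a sketch only; the non-allocating lookup helper is hypothetical):

/*
 * Sketch only; unionfs_find_node_status() is a hypothetical
 * non-allocating counterpart of unionfs_get_node_status().
 */
unsp = unionfs_find_node_status(unp, td);
if (unsp == NULL) {
        /* No open bookkeeping for this process; nothing to undo on close. */
        return (0);
}
/* ... existing close() bookkeeping against unsp ... */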

New round of review. I think I have looked at all the changes now, but may do an additional quick check pass tomorrow.

I'm proposing an alternate way to deal with cross-FS locking problems (in an inline response to your question; I'll elaborate later on the details, but they are pretty straightforward).

There is a remaining bug in unionfs_getextattr. The rest is answers to your latest responses, clarifications and requests for more comments explaining some code.

sys/fs/unionfs/union_subr.c
392–414

That's right, I forgot the read-only case (which I don't think is that unlikely), mostly because assuming that the containing directory has an upper vnode is so ingrained in the current code (and OK as an assumption as long as it is made as part of a write operation). This also is something that I'll likely change (time of creation for shadow directories should become configurable).

Concerning the different-FS deadlock risk, I have a hard time finding any useful scenario where two FSes would be combined in opposite ways. That said, foot-shooting is easy, and an administrator may try to mount unionfs with the layers swapped (I'm pretty sure there is even a PR about such a scenario), which could lead to deadlocks. And that is an unacceptable system behavior, however infrequent it is. However, this doesn't mean it has to be prevented at the level of vnodes. A deadlock-avoidance scheme that would directly prevent such erroneous mounts seems a much better solution to me, and would allow simplifying a lot of code.

418–432

Since dooming of the underlying vnodes cannot happen before unmount of the union above, if insmntque1() doesn't return failure after having locked the target underlying vnode, the latter simply can't be doomed before it is unlocked. That means that the two VN_IS_DOOMED() calls can't return true, and the blocks below the ifs are effectively dead code.

Given that the unionfs vnode is itself locked (through the appropriate underlying vnode), I agree that lowervp can't be doomed, regardless of whether it was locked, because the unionfs vnode can't be doomed yet: insmntque1() checks for MNTK_UNMOUNT, and dounmount() sets this flag first thing, so before dooming the unionfs vnode, which requires its lock. I think we should add at least an assertion about lowervp.

To conclude, I'd just remove the ifs, whose blocks are dead code, and would add assertions that both underlying vnodes are not doomed. And your changes here indeed do not disturb what callers see.
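That is, something like (sketch):

/* Sketch: replace the dead VN_IS_DOOMED() branches with assertions. */
VNASSERT(lowervp == NULL || !VN_IS_DOOMED(lowervp), lowervp,
    ("%s: doomed lower vnode despite insmntque1() success", __func__));
VNASSERT(uppervp == NULL || !VN_IS_DOOMED(uppervp), uppervp,
    ("%s: doomed upper vnode despite insmntque1() success", __func__));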

488–491

And, for a below mount, calling vn_lock_pair() on the root vnode would assert as well since the upper vnode's lock is locked recursively.

Yes, a comment is warranted. Thanks.

1014

Trailing whitespace.

1015

Trailing whitespace.

1663

Superfluous whitespace.

sys/fs/unionfs/union_vnops.c
627–629

Looking back over unionfs_get_node_status(), perhaps a better change would be to keep the warning but allow the close() path to use a NULL status object if one doesn't already exist, rather than allocating a new one through unionfs_get_node_status(). This would certainly be more efficient.

Yes, I spotted that more than a year ago and still remember it. Would be fine.

There is indeed the VOP_CLOSE() pairing problem in the VFS. I had some plan at some point, but mostly forgot it and will have to think about it again.

681–684

You have to reload unp here for the ENOENT case, else this may cause a crash or an access to the wrong structure from the code at unionfs_open_abort.

713

Trailing whitespace.

715–716

Trailing whitespace.

832–837

Same trailing whitespaces as in unionfs_open().

1958–1960

Trailing whitespace.

2165

Trailing whitespace.

2239–2246

This whole block seems superfluous. There's no need to tweak flags for LK_UPGRADE or LK_DOWNGRADE: these don't make sense if the calling thread hasn't already locked the vnode, and in that case the underlying lock object can only be changed by this very thread (by an operation that requires holding the lock exclusively and returns the lock as it was), which means the current thread always knows the state of the lock. Therefore, we should just obey it.

2241

Please add a comment explaining why this is needed, along the lines of: "In-between unlock of the interlock and lock of the lower vnode, another process has a window where it could also acquire the lower vnode lock and then doom the unionfs vnode (changing its lock to v_lock) or perform an operation leading to the creation of an upper vnode, which then would be the one to lock. So we have to check for this once the lower vnode is locked, and if so restart the locking operation."

2243–2248

VV_ROOT case? (see unionfs_noderem())

Answered in unionfs_noderem(), thanks.

Please add a small comment saying that 'lvp_locked' is necessarily false on VV_ROOT.

2243–2248

Mentioning unionfs_node_update() as well here and the additional guarantee we need below.

(Proposed text has been justified by hand.)

2675–2678

unp must be reloaded and could be NULL here (as you handled it in unionfs_setextattr). Alternatively, test for error being ENOENT.

sys/fs/unionfs/union_subr.c
392–414

You might be thinking of bug 172334, which landed on me fairly recently. If there is a general way to prevent such conflicting mounts at the time the mount is requested (and if that is in fact the only case that could lead to LOR in places like this), then I agree that would definitely be preferable to a vnode-centric solution.

sys/fs/unionfs/union_vnops.c
2239

'flags' should be passed here instead of ap->a_flags, otherwise we may miss LK_CANRECURSE set above. More generally, I'll clean up the flags handling in this function so that all updates are made against the 'flags' local, with ap->a_flags only modified in the one case in which we need to pass 'ap' along.

2239–2246

I think you're right that the LK_DOWNGRADE case is superfluous: that should instead be an assert, as already holding lvp's lock exclusive should guarantee that we don't first issue the downgrade request to lvp and then find that the unionfs vnode has been doomed or copied up.

The LK_UPGRADE case is needed though: LK_UPGRADE may cause the lock to be temporarily dropped (e.g. consider the case in which the lock has multiple shared holders), creating a window in which dooming or copy-up is possible.
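The underlying revalidation pattern, independent of unionfs_lock()'s exact structure, is roughly the following (a generic sketch, not the patch code):

/*
 * Generic sketch of the stacked-filesystem revalidation pattern: if
 * the lock operation may sleep or, as with LK_UPGRADE, transiently
 * drop the lock, the vnode's lock may have been switched underneath
 * us (dooming, copy-up), so recheck and retry against the new lock.
 */
for (;;) {
        struct lock *lk = vp->v_vnlock;

        error = lockmgr(lk, flags, NULL);
        if (error != 0 || lk == vp->v_vnlock)
                break;                  /* failed, or still the right lock */
        lockmgr(lk, LK_RELEASE, NULL);  /* lock was replaced underneath: retry */
}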

sys/fs/unionfs/union_subr.c
392–414

Yes, exactly. I had listed that PR in the proposal document. I don't think we need to design/implement the mount dependencies deadlock avoidance scheme right now, as it isn't a prerequisite for the changes here. I'll address that point in the first phase as well.

944–946

But the other issue is that dvp may be reclaimed due to relocking of udvp in various places (unionfs_relookup, VOP_MKDIR, VOP_SETATTR).

Right, I have been too optimistic with this. Given that, I agree it's not worth the complexity of trying to distinguish the case where the dvp and udvp locks are still aliased just to optimize locking. Optimizing locking is not just for performance; it also gives hints about locking expectations at certain points, so it can aid understanding (or, at least, help ask the right questions).

1013–1014

Avoid an unlock/relock for uvp in the common case.

sys/fs/unionfs/union_vnops.c
2239–2246

Yes, the LK_DOWNGRADE case is superfluous for the reason you stated, so an assert would be better.

I forgot about a possible lock drop in LK_UPGRADE. You're right then, the LK_UPGRADE case where LK_EXCLUSIVE is explicitly set is absolutely necessary (and the flag has to be explicit because we are changing the lock on which to operate, and we don't have this one locked).

sys/fs/unionfs/union_subr.c
1013–1014

I think for this we'll want VOP_VPUT_PAIR(udvp, &uvp, false), as we'll be unlocking udvp while a lock is still held on its child.

sys/fs/unionfs/union_subr.c
1013–1014

You're right, this is what, e.g., kern_mkdirat() does.

The need to call VOP_VPUT_PAIR() (a misnomer) complicates the VFS interface in non-obvious ways (and is not documented at all). For a while I've had the impression that the introduction of that operation may have been ill-advised, but to be fair I haven't spent enough time yet to see whether a better alternative is possible.
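For reference, the kern_mkdirat()-style sequence being discussed looks roughly like this (a sketch only; cnp, va, udvp and uvp are assumed locals, not the exact unionfs code):

/*
 * Sketch of the kern_mkdirat()-style sequence under discussion.
 */
error = VOP_MKDIR(udvp, &uvp, &cnp, &va);
if (error == 0) {
        /* ... use uvp while both udvp and uvp are locked ... */
        /*
         * Release and unlock udvp while the newly created child uvp
         * stays locked, letting the filesystem handle any ordering
         * constraints between the two vnodes.
         */
        VOP_VPUT_PAIR(udvp, &uvp, false);
}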

jah marked 48 inline comments as done.
  • unionfs: do not create a new status object during vop_close()
  • Review feedback from olce@, fix for hung process reported by pho@

I ran tests with D45398.139944.patch for a day without observing any issues.