commit a0a4dd9dcb2c947aa0fb46f0f8641b13c8b9c694 Author: Mateusz Guzik <mjg@FreeBSD.org> Date: Mon Nov 7 23:43:30 2022 +0000 Stop sync restarts on vget failure due to doomed vnode All implementations look copy-pasted from ufs. Restart was there already in the initial BSD 4.4 Lite import, most likely used to damage control broken linkage should the target vnode "change identity". This definitely is not a problem.
Details
- Reviewers
kib
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
I do not think it is correct/intended to return from ffs_sync() if we saw and skipped a doomed vnode. It might be still dirty, and some other thread works on finishing syncing it. If we return ignoring the vnode, the VFS_SYNC contract is broken.
but dooming keeps the vnode locked the entire time, thus by that time vget succeds and sees ENOENT, there is nothing to do there.
I also note the loop can trivially race against a just remove vnode.
If you are worried about htel ock be temporarily dropped somewhere, the code can explicitly check for VBAD which is only set at the end.
The point is that, if you do not wait for unlock, dirty data might not reach the storage when ffs_sync() returns.
But the code *is* waiting for unlock with and without the patch. The difference is that the patched variant does not loop when it happens.
Excluding variants which pass LK_NOWAIT, in which case this does not matter.
Yes, I think we need to ensure that doomed vnode is completely reclaimed, before returning from the VFS_SYNC() implementations. Yes, at least UFS reclaim might drop the vnode lock, e.g. due to inactive->truncate->fsync->softdep locking dvp