
fusefs: fix a vnode locking bug during VOP_READ
Needs Review · Public

Authored by asomers on Fri, Jan 23, 9:32 PM.

Details

Reviewers
markj
vishwin
Summary

If a fuse file system is mounted with "-o async", then VOP_READ may be
called with the vnode locked in shared mode rather than exclusive mode.
But if the vnode's attribute cache has expired, then the kernel will
attempt to update it. That triggers a vnode-not-exclusively-locked
assertion.

Fix the bug by attempting to upgrade the shared lock to an exclusive
lock. If the upgrade fails, simply don't cache the new attributes. The
attribute cache will remain stale, and a subsequent read (or other
operation) on the same vnode will attempt to fetch the attributes and
update the cache again.

This bug has been present for a long time, but the assertion was
disabled on stock kernels (including with INVARIANTS) until 15.0.

PR: 291064
Reported by: markj, vishwin, groenveld@acm.org
MFC after: 2 weeks
Sponsored by: ConnectWise

Test Plan

Test case added.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 70090
Build 66973: arc lint + arc unit

Event Timeline

Is this the only place where the cache is updated under a shared vnode lock?
Or even better, can you point me to the single entry point of the cache update code?

That said, there is a much better strategy for solving this issue than trying to upgrade the lock:
add one more mutex (or sx; I am not sure which operations the cache update performs) and lock it around the cache update code when the vnode is only share-locked.

I did not code it myself because I wanted an easy experiment to see whether exclusive locking would solve the problem. Besides, I do not know where the cache update code is.

sys/fs/fuse/fuse_internal.c
337

This is a style violation.

In D54850#1253849, @kib wrote:

Is this the only place where the cache is updated under a shared vnode lock?
Or even better, can you point me to the single entry point of the cache update code?

I'm auditing now to look for other cases. I think I've found one in fuse_internal_invalidate_inode (which is used by very few real-world FUSE file systems, but has coverage in the test suite).

That said, there is a much better strategy for solving this issue than trying to upgrade the lock:
add one more mutex (or sx; I am not sure which operations the cache update performs) and lock it around the cache update code when the vnode is only share-locked.

I did not code it myself because I wanted an easy experiment to see whether exclusive locking would solve the problem. Besides, I do not know where the cache update code is.

I could do that. In terms of performance, how does a mutex compare to the vnode lock?

In D54850#1253849, @kib wrote:

Is this the only place where the cache is updated under a shared vnode lock?
Or even better, can you point me to the single entry point of the cache update code?

I'm auditing now to look for other cases. I think I've found one in fuse_internal_invalidate_inode (which is used by very few real-world FUSE file systems, but has coverage in the test suite).

That said, there is a much better strategy for solving this issue than trying to upgrade the lock:
add one more mutex (or sx; I am not sure which operations the cache update performs) and lock it around the cache update code when the vnode is only share-locked.

I did not code it myself because I wanted an easy experiment to see whether exclusive locking would solve the problem. Besides, I do not know where the cache update code is.

I could do that. In terms of performance, how does a mutex compare to the vnode lock?

It is really about code correctness. With that approach, you would never have an incorrect or stale cache.

Mutexes and sx locks behave the same in the uncontested case; the difference appears only under contention. Both use a single atomic operation for lock and unlock when the lock is readily available.
Mutexes queue waiting (blocked, in FreeBSD terminology) threads on turnstiles, which perform priority propagation to avoid priority inversion. Sx locks, on the other hand, use sleepqueues for waiting (sleeping, in FreeBSD terminology) threads; sleepqueues do not propagate priority but allow unbounded sleeps. The usual choice of exclusive locking primitive is a mutex, unless the code needs to sleep while holding the lock, either directly as in msleep(9) or indirectly by taking other locks that can sleep, e.g. sx.