Page MenuHomeFreeBSD

FUSE: Respect userspace FS "do-not-cache" of file attributes
AbandonedPublic

Authored by cem on Feb 5 2019, 8:30 PM.

Details

Reviewers
None
Summary

The FUSE protocol demands that kernel implementations cache user filesystem
file attributes (vattr data) for a maximum period of time in the range of
[0, ULONG_MAX] seconds. In practice, typical requests are for 0, 1, or 10
seconds; or "a long time" to represent indefinite caching.

Historically, FreeBSD FUSE has ignored this client directive entirely. This
works fine for local-only filesystems, but causes consistency issues with
multi-writer network filesystems.

For now, respect 0 second cache TTLs and do not cache such metadata.
Non-zero metadata caching TTLs in the range [0.000000001, ULONG_MAX] seconds
are still cached indefinitely, because it is unclear how a userspace
filesystem could do anything sensible with those semantics even if
implemented.

In the future, as an optimization, we should implement notify_inval_entry,
etc, which provide userspace filesystems a way of evicting the kernel cache.

One potentially bogus access to invalid cached attribute data was left in
fuse_io_strategy. It is restricted behind the undocumented and non-default
"vfs.fuse.fix_broken_io" sysctl or "brokenio" mount option; maybe these are
deadcode and can be eliminated?

Some minor APIs changed to facilitate this:

  1. Attribute cache validity is tracked in FUSE inodes ("fuse_vnode_data").
  1. cache_attrs() respects the provided TTL and only caches in the FUSE

inode if TTL > 0. It also grows an "out" argument, which, if non-NULL,
stores the translated fuse_attr (even if not suitable for caching).

  1. FUSE VTOVA(vp) returns NULL if the vnode's cache is invalid, to help

avoid programming mistakes.

  1. A VOP_LINK check for potential nlink overflow prior to invoking the FUSE

link op was weakened (only performed when we have a valid attr cache). The
check is racy in a multi-writer network filesystem anyway -- classic TOCTOU.
We have to trust any userspace filesystem that rejects local caching to
account for it correctly.

PR: 230258 (partial)

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 22358
Build 21533: arc lint + arc unit