
fusefs: fix VOP_READDIR problems for NFS-exported FUSE file systems
ClosedPublic

Authored by asomers on Jan 3 2022, 12:41 AM.

Details

Summary

Fix NFS exports of FUSE file systems for big directories

The FUSE protocol does not require that a directory entry's d_off field
outlive the lifetime of its directory's file handle. Since the NFS
server must reopen the directory on every VOP_READDIR call, that means
it can't pass uio->uio_offset down to the FUSE server. Instead, it must
read the directory from 0 each time. It may need to issue multiple
FUSE_READDIR operations until it finds the d_off field that it's looking
for. That was the intention behind SVN r348209 and r297887, but a logic
bug prevented subsequent FUSE_READDIR operations from ever being issued,
rendering large directories incompletely browseable.
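
The lookup described above can be sketched in userland C. This is only an illustration under stated assumptions, not the fusefs kernel code: `struct mock_dirent`, `mock_readdir`, `readdir_from_cookie`, and `CHUNK` are hypothetical names, and `mock_readdir` stands in for one FUSE_READDIR request against a freshly opened directory. The point it shows is that the loop must keep issuing requests until the wanted d_off has been seen, rather than stopping after the first reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK 2	/* entries per mock FUSE_READDIR reply; deliberately tiny */

/* Hypothetical stand-in for one directory entry in a FUSE_READDIR reply. */
struct mock_dirent {
	uint64_t d_off;		/* opaque cookie identifying the position after this entry */
	const char *name;
};

/*
 * Hypothetical stand-in for one FUSE_READDIR request on an open handle.
 * An offset of 0 means "from the beginning"; any other offset must be a
 * d_off handed out earlier on the same handle.
 */
static size_t
mock_readdir(const struct mock_dirent *dir, size_t n, uint64_t off,
    struct mock_dirent *out, size_t max)
{
	size_t i = 0, copied = 0;

	if (off != 0) {
		/* Resume just past the entry that carries this cookie. */
		while (i < n && dir[i].d_off != off)
			i++;
		if (i < n)
			i++;
	}
	while (i < n && copied < max)
		out[copied++] = dir[i++];
	return (copied);
}

/*
 * Serve a stateless caller that resumes at `cookie`: always start at
 * offset 0, discard entries until the cookie is seen, then return the
 * names that follow.  *nrequests counts the mock FUSE_READDIR calls.
 */
static size_t
readdir_from_cookie(const struct mock_dirent *dir, size_t n, uint64_t cookie,
    const char **names, size_t max_names, unsigned *nrequests)
{
	struct mock_dirent buf[CHUNK];
	uint64_t off = 0;		/* never the caller's cookie */
	int found = (cookie == 0);	/* cookie 0: nothing to skip */
	size_t out = 0, got, i;

	assert(names != NULL && nrequests != NULL);
	while (out < max_names) {
		got = mock_readdir(dir, n, off, buf, CHUNK);
		(*nrequests)++;
		if (got == 0)
			break;		/* end of directory */
		for (i = 0; i < got; i++) {
			if (found) {
				if (out < max_names)
					names[out++] = buf[i].name;
			} else if (buf[i].d_off == cookie) {
				found = 1;
			}
		}
		/* Continue within this handle, where d_off is still valid. */
		off = buf[got - 1].d_off;
	}
	return (out);
}
```

With a five-entry directory and a cookie that points into the second reply, the sketch needs several mock requests before it can return anything, which is exactly the case the logic bug broke.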

MFC after: 3 weeks

fusefs: optimize NFS readdir for FUSE_NO_OPENDIR_SUPPORT

At its lowest common denominator, FUSE does not require that a directory
entry's d_off field is valid outside of the lifetime of the directory's
FUSE file handle. But since NFS is stateless, it must reopen the
directory on every call to VOP_READDIR. That means reading the
directory all the way from the first entry. Not only does this create
an O(n^2) condition for large directories, but it can also result in
incorrect behavior if either:

  • The file system _does_ change the d_off field for the last directory entry previously seen by NFS, or
  • The file system deletes the last directory entry previously seen by NFS.

Handily, for file systems that set FUSE_NO_OPENDIR_SUPPORT, d_off is
guaranteed to be valid for the lifetime of the directory entry, so there
is no need to read the directory from the start.
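
The decision this commit describes can be sketched as a small pure function. It is a simplified model, not the actual fusefs code: `struct readdir_plan` and `plan_nfs_readdir` are hypothetical names, and the boolean argument stands for whether the file system set FUSE_NO_OPENDIR_SUPPORT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of how to satisfy one stateless readdir call. */
struct readdir_plan {
	uint64_t start_offset;	/* offset to send in the first FUSE_READDIR */
	bool must_skip;		/* discard entries until the cookie is seen */
};

/*
 * If d_off outlives the file handle (FUSE_NO_OPENDIR_SUPPORT), the NFS
 * cookie can be passed straight through.  Otherwise the directory has to
 * be re-read from offset 0 and scanned for the cookie.
 */
static struct readdir_plan
plan_nfs_readdir(bool no_opendir_support, uint64_t cookie)
{
	struct readdir_plan p;

	if (no_opendir_support || cookie == 0) {
		p.start_offset = cookie;
		p.must_skip = false;
	} else {
		p.start_offset = 0;
		p.must_skip = true;
	}
	return (p);
}
```

In the pass-through case each NFS call costs work proportional to the entries it returns, so the quadratic re-scan of earlier entries disappears.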

MFC after: 3 weeks

fusefs: require FUSE_NO_OPENDIR_SUPPORT for NFS exporting

FUSE file systems that do not set FUSE_NO_OPENDIR_SUPPORT do not
guarantee that d_off will be valid after closing and reopening a
directory. That conflicts with NFS's statelessness, which results in
unresolvable bugs when NFS reads large directories, if either:

  • The file system _does_ change the d_off field for the last directory entry previously returned by VOP_READDIR, or
  • The file system deletes the last directory entry previously seen by NFS.

Rather than doing a poor job of exporting such file systems, it's better
just to refuse.
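
A minimal sketch of such a refusal is shown below. It is an assumption-laden model rather than the patch itself: `struct mock_session`, `MOCK_NO_OPENDIR_SUPPORT`, and `mock_vptofh` are hypothetical, and the choice of EOPNOTSUPP as the error is a guess at a reasonable errno, not something stated in this review. The idea is that the operation that would hand a file handle to the NFS server checks the session flag first.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical flag bit: the file system set FUSE_NO_OPENDIR_SUPPORT. */
#define MOCK_NO_OPENDIR_SUPPORT	0x1u

/* Hypothetical per-mount session state. */
struct mock_session {
	uint32_t flags;
};

/*
 * Model of the check made before producing an NFS file handle.
 * Returns 0 if exporting may proceed, or an errno value if the file
 * system cannot promise stable d_off cookies.
 */
static int
mock_vptofh(const struct mock_session *s)
{
	if ((s->flags & MOCK_NO_OPENDIR_SUPPORT) == 0)
		return (EOPNOTSUPP);	/* assumed error code */
	return (0);
}
```

Failing at file-handle creation means an unsuitable file system is rejected before the NFS server ever depends on its cookies.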

Even though this is technically a breaking change, 13.0-RELEASE's
NFS-FUSE support was bad enough that an MFC should be allowed.

MFC after: 3 weeks

Test Plan

Manual testing with bfffs, fuse-ext2, and lklfuse, over both NFSv3 and NFSv4.2.

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

  • Prohibit exporting file systems during VOP_VPTOFH, not VOP_MOUNT

@rmacklem will you be able to review this PR? I'd like to get it into FreeBSD 13.1.

Looks ok, if I understood what the patch does.
Basically, instead of reading a directory from the
beginning of it, it simply refuses to export the file
system unless it has the FSESS_NO_OPENDIR_SUPPORT
property, which means the cookies remain valid.

If that is basically what the patch does, it seems fine to me.

This revision is now accepted and ready to land. Feb 2 2022, 9:03 PM

Yep, that's exactly the idea.