Page MenuHomeFreeBSD

fusefs: set d_off during VOP_READDIR
ClosedPublic

Authored by asomers on Fri, Feb 12, 1:04 AM.

Details

Summary

This allows lseek to be used with lseek to position the file so that
getdirentries(2) will return the next entry. It is no used by readdir(3).

PR: 253411
Reported by: John Millikin <jmillikin@gmail.com>
MFC after: 1 week

Test Plan

test case added

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

In the commit summary, is this a typo?

This allows lseek to be used with lseek to...

I don't understand the sentence. Is the 2nd lseek FUSE?

It is no used by readdir

typo "no" -> "not"?

sys/fs/fuse/fuse_internal.c
623–624

bytesavail is a bad name for this. Maybe oreclen for output reclen?

640

So, for FreeBSD, I think d_off is expected to be the offset of the *next* dirent in the logical stream. I.e., d_off = nbytes_already_emitted + bytesavail.

It's a little weird for us because we're translating between a FUSE-formatted stream and a dirent-formatted output. I think normally I'd say offsets and seeks on the stream from FreeBSD should be in terms of the dirent output rather than the fuse_dirent input, but that makes lseek from userspace to FUSE basically impossible (with some manual cache internal to FUSE -- ugly).

I think we should (1) keep a running total of nbytes_already_read = sum(freclen), and (2) validate that fudge->off == nbytes_already_read + freclen (or whatever is appropriate for FUSE semantics -- maybe it should just be nbytes_already_read) to verify the internal consistency of the FUSE dirent stream we're getting from the untrusted server.

662–663

This check should go before the if (*fnd_start != 0) fill clause -- startoff is inclusive.

tests/sys/fs/fusefs/Makefile
73–75

GENERIC_DIRSIZ / DIRLEN is guaranteed to produce 8-byte aligned records, and the struct only requires 8-byte alignment on every arch. It might be difficult to persuade the compiler of that, but you could try adding alignment assertions.

tests/sys/fs/fusefs/readdir.cc
224

Zero is a valid fd. (ASSERT_GT(fd, 0) is more legible to me, but whatever.)

227–229

Maybe validate reclen?

asomers added inline comments.
sys/fs/fuse/fuse_internal.c
623–624

Yeah, I agree.

640

You're making assumptions about what d_off means. What you describe makes sense if the stream is a compact sequence of dirents, as it might be for UFS. But the FUSE server is allowed to interpret d_off however it wishes. I don't think we have any choice but to accept whatever the server tells us. The worst that happens is that the user ends up seeking to an invalid offset. But only the server knows which offsets are invalid.

662–663

No, the existing placement is correct. fudge->off is the seek offset of the _next_ dirent, not the current one.

tests/sys/fs/fusefs/Makefile
73–75

I would rather not. Even if it works, I think it would make the code confusing, and confusing is bad in test cases.

asomers marked 4 inline comments as done.
  • fusefs: set d_off during VOP_READDIR
  • Respond to cem's review comments.
sys/fs/fuse/fuse_internal.c
640

Ok

662–663

I'm not sure what I was thinking, you're right.

tests/sys/fs/fusefs/Makefile
73–75

I suppose it could be done in a confusing way, but I don't think it has to be. Tests document assumptions, including the behavior of primitives we rely on. The -Wno-cast-align is a red flag that suggests this code may only work on x86.

Try it? If you've made a good effort and it cannot be done, that's one thing, but if we haven't tried I don't think it's valid to reject.

tests/sys/fs/fusefs/Makefile
73–75

I can't get it to work by adding assertions. The best I can do is to add an alignment macro, and then assert that the alignment macro did nothing. And that seems silly:

	de1p = &(buf[de0->d_reclen]);
	de1 = (struct dirent*)_ALIGN(de1p);	/* Suppress Wcast-align */
	ASSERT_EQ((char*)de1, de1p);
tests/sys/fs/fusefs/Makefile
73–75

Yeah. What bugs me here is that the compiler clearly knows that buf is aligned, because the similar cast:

de0 = (struct dirent*)&buf[0];

doesn't produce a warning, despite clearly increasing alignment requirement from 1 to 8 as well.

But any sort of assertion on reclen or whatever doesn't seem to result in it grokking that the next record is also aligned. So it's a shortcoming in the warning analysis in Clang. Bleh.

For what it's worth, casting through (void*) defeats the warning, although obviously it doesn't do anything for actually ensuring the alignment.

Feel free to keep the -Wno-cast-align

tests/sys/fs/fusefs/Makefile
73–75

I'm guessing that the compiler doesn't complain about de0 since buf is stack-allocated. Either that ensures 8-byte alignment, or the compiler forces 8-byte alignment because it knows about the cast. I think I'll just keep the -Wno-cast-align then. Is there anything else I need to do to this PR?

cem added inline comments.
tests/sys/fs/fusefs/Makefile
73–75

Yeah, one way or another Clang knows that buf is 8-byte aligned. It fails to propagate that information when adding values that are multiples of 8 to the base pointer somehow.

No, nothing else, looks good to me.

This revision is now accepted and ready to land.Sat, Feb 13, 4:44 AM
This revision was automatically updated to reflect the committed changes.