fusefs: implement FUSE_COPY_FILE_RANGE.
ClosedPublic
Actions

Authored by asomers on Dec 29 2020, 8:16 PM.

Details

Reviewers

cem
emaste

Commits

rG92bbfe1f0d1f: fusefs: implement FUSE_COPY_FILE_RANGE.
rGae39db74066a: fusefs: fix an expectation in one of the tests

Summary

fusefs: fix an expectation in one of the tests

An order-of-operations problem caused an expectation intended for
FUSE_READ to instead match FUSE_ACCESS. Surprisingly, only one test
case was affected.

fusefs: update FUSE protocol to 7.28

None of the new optional features are implemented yet.

fusefs: implement FUSE_COPY_FILE_RANGE.

MFC after: 2 weeks
Relnotes: yes

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

asomers created this revision.Dec 29 2020, 8:16 PM

Herald added a subscriber: imp. · View Herald TranscriptDec 29 2020, 8:16 PM

asomers requested review of this revision.Dec 29 2020, 8:16 PM

Harbormaster completed remote builds in B35754: Diff 81324.Dec 29 2020, 8:16 PM

Add missing file

Harbormaster completed remote builds in B35755: Diff 81325.Dec 29 2020, 8:18 PM

cem accepted this revision.Dec 30 2020, 4:22 PM

cem added inline comments.

sys/fs/fuse/fuse_internal.c
1244–1246 ↗	(On Diff #81325)	What's the locking protocol for this function's vnode? Should we have an ASSERT_VOP_LOCKED/ELOCKED/UNLOCKED? Edit: and now I see this function was just moved from another file. I'll leave the questions in case they're interesting.
1256–1257 ↗	(On Diff #81325)	For these kind of things, I would generally early-return to save the horizontal compression forced on the meat of the function. But that's just personal preference; this is fine.
1266–1269 ↗	(On Diff #81325)	I'm not sure this makes sense. I think we want to do something if the setattr fails. (Side note: I forget what fuse_internal_setattr does or why it might fail. I might be imagining a mode that doesn't make sense. Just tell me if I'm wrong :-).) I think we really don't want to leak a SUID vnode written by non-root into the global namespace. Right? If setattr fails, we need to nuke the vnode, or something extreme like that, to maintain the integrity of our security model. You could imagine a malicious FUSE server program that accepted writes to suid programs from certain unprivileged users and produced errors on subsequent setattrs, to enable a backdoor of the ordinary security model. Whether or not this produces an error code at the userspace level is kind of less important to me than whether or not we actually have a half-completed operation. Does that make sense?
sys/fs/fuse/fuse_io.c
271–281 ↗	(On Diff #81325)	Should this be conditional on `if (err == 0)`? Or maybe not — a write might partially succeed and still produce an error, in which case we'd want to clear SUID?
sys/fs/fuse/fuse_vnops.c
653 ↗	(On Diff #81325)	Isn't this the same as `fsess_not_impl` now?
666–668 ↗	(On Diff #81325)	Do we need to do anything to avoid leaking `infufh` here? I forget how these lifetimes work but see that we don't do any special cleanup at the bottom, either.
670–686 ↗	(On Diff #81325)	There is sort of a nice subroutine for this already: `vn_lock_pair()` in kern/vfs_vnops.c. It does a little more than we want: handles one vnode being NULL; and a little less than we want: doesn't SHARE-lock one of the vnodes, doesn't handle `vp1 == vp2`. But I'd suggest moving the dual-vnode logic we need to a subroutine, possibly also in vfs_vnops.c, and doing a similar thing as far as alternating which lock we start from. It may be usable by other filesystems implementing this VOP.
688–690 ↗	(On Diff #81325)	These assignments and the `io` variable lifetime could be moved inside this condition. (Do we ignore rlimits if `fsizetd == NULL`? I don't fully understand the semantics of this VOP.)
699–701 ↗	(On Diff #81325)	It is pretty odd to me that FUSE chose the read file as sort of the primary subject of this operation. I would have picked the written file. Just an off-hand comment, not actionable.

This revision is now accepted and ready to land.Dec 30 2020, 4:22 PM

asomers marked 6 inline comments as done.Dec 31 2020, 6:07 PM

asomers added inline comments.

sys/fs/fuse/fuse_internal.c
1266–1269 ↗	(On Diff #81325)	The "malicious FUSE server" scenario doesn't work. The kernel already refuses to honor the setuid bit on any file system mounted by a non-root user. So the malicious FUSE server would need to already have root privileges in order for the attack to work. The real problem is that there's no way to indicate an error of "the write succeeded but the permission bits are now wrong". Those two things are supposed to happen automatically. The real solution is to implement `FUSE_WRITE_KILL_PRIV` (and its tweaks in later protocol versions), which instruct the server to kill suid at the same time that it does the write.
sys/fs/fuse/fuse_io.c
271–281 ↗	(On Diff #81325)	Yeah, that's what I was thinking. Kill suid even if the write fails.
sys/fs/fuse/fuse_vnops.c
666–668 ↗	(On Diff #81325)	No. `fuse_filehandle_getrw` doesn't allocate anything, it just looks up an entry in an existing table.
670–686 ↗	(On Diff #81325)	I didn't know about vn_lock_pair. I copied this section from nfs_copy_file_range. However, fusefs's version is not identical to nfs's, because nfs needs to worry about vn_start_write/vn_finish_write, and fusefs does not. I don't think I should make fusefs's a subroutine right now, because there's only one consumer, and I don't know how the exact needs might vary between consumers.
688–690 ↗	(On Diff #81325)	Yes. If `a_fsizetd` is NULL, that indicates that the write is coming from a privileged process, like the NFS server.