
cp: use copy_file_range(2)
ClosedPublic

Authored by asomers on Sep 9 2020, 4:07 PM.

Details

Summary

cp: use copy_file_range(2)

This has three advantages over write(2)/read(2):

  • Fewer context switches and data copies
  • Mostly preserves a file's sparseness
  • On some file systems (currently NFS 4.2) the file system will perform the copy in an especially efficient way.
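
As a rough illustration of the approach described above (not the code from this diff), a copy loop built on copy_file_range(2) might look like the sketch below. The helper name `copy_fd_contents` and the constant `CHUNK_MAX` are hypothetical; the 2 MB per-call cap mirrors the buffer size mentioned in the review comments.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes copy_file_range(2) in glibc; harmless on FreeBSD */
#endif
#include <sys/types.h>
#include <errno.h>
#include <unistd.h>

/* Assumed per-call cap: 2 MB, the buffer size discussed in this review. */
#define CHUNK_MAX ((size_t)2 * 1024 * 1024)

/*
 * Hypothetical helper, not the code from cp(1): copy everything from
 * from_fd to to_fd, starting at each descriptor's current file offset.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
copy_fd_contents(int from_fd, int to_fd)
{
	ssize_t n;

	for (;;) {
		/* NULL offsets: use and advance each descriptor's own offset. */
		n = copy_file_range(from_fd, NULL, to_fd, NULL, CHUNK_MAX, 0);
		if (n > 0)
			continue;	/* copied a chunk, keep going */
		if (n == 0)
			return (0);	/* end of input */
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		return (-1);
	}
}
```

The data never passes through a userspace buffer, which is where the savings in copies and context switches come from. A real caller would probably also want a read(2)/write(2) fallback for descriptor types that copy_file_range(2) rejects.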

Test Plan

Tested by hand with dense and sparse files. Ran the FreeBSD test
suite (minus ZFS) and saw no regressions.

Diff Detail

Repository: rS FreeBSD src repository - subversion
Lint: Not Applicable
Unit Tests: Not Applicable

Event Timeline

Looks fine to me.

One concern I had w.r.t. doing this was that, when large
copy_file_range() copies are done, the time it takes for the
copy to complete in the kernel can be significant (which
delays signal handling among other things).
During review of the syscall, kib@ did not think this was
a serious issue, since it was similar to a core dump of
a large application.

However, with the buffer size limited to 2Mbytes, this
should not be a problem.

Thanks for doing this. I had it on my todo someday list.

This revision is now accepted and ready to land. Sep 10 2020, 1:47 AM

Yeah, doing copies 2 MB at a time shouldn't be too slow. However, it does cause a different problem, I think. When I used this to copy a 16 GB sparse file (512 B used) on UFS, the output file had 32 MB used. I'm guessing that's due to UFS indirect blocks, and that increasing the copy_file_range buffer size would reduce the number of indirect blocks created. Alternatively (and probably better), cp could use SEEK_HOLE/SEEK_DATA to skip the holes completely. But this version is still a lot better than the previous one.
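
For illustration only, the SEEK_HOLE/SEEK_DATA idea could be sketched roughly as follows. This is not code from cp(1): the helper name `copy_skipping_holes` is hypothetical, the destination is assumed to be an empty regular file, and plain pread(2)/pwrite(2) is used for the data regions to keep the sketch focused on the seek logic.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* exposes SEEK_DATA/SEEK_HOLE in glibc; harmless on FreeBSD */
#endif
#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy only the data regions of from_fd into to_fd
 * (assumed to be an empty regular file), leaving the holes unwritten.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int
copy_skipping_holes(int from_fd, int to_fd)
{
	char buf[64 * 1024];
	struct stat sb;
	off_t pos = 0, hole;
	ssize_t n, w, done;
	size_t want;

	if (fstat(from_fd, &sb) == -1)
		return (-1);

	while (pos < sb.st_size) {
		/* Jump to the next region that actually holds data. */
		pos = lseek(from_fd, pos, SEEK_DATA);
		if (pos == -1) {
			if (errno == ENXIO)
				break;	/* only a hole remains before EOF */
			return (-1);
		}
		/* Find where that data region ends. */
		hole = lseek(from_fd, pos, SEEK_HOLE);
		if (hole == -1)
			return (-1);

		while (pos < hole) {
			want = (size_t)(hole - pos);
			if (want > sizeof(buf))
				want = sizeof(buf);
			n = pread(from_fd, buf, want, pos);
			if (n == -1)
				return (-1);
			if (n == 0)	/* source shrank underneath us */
				return (ftruncate(to_fd, pos));
			for (done = 0; done < n; done += w) {
				w = pwrite(to_fd, buf + done,
				    (size_t)(n - done), pos + done);
				if (w == -1)
					return (-1);
			}
			pos += n;
		}
	}
	/* Set the final size so that a trailing hole is preserved. */
	return (ftruncate(to_fd, sb.st_size));
}
```

Because nothing is ever written inside a hole, the destination should end up with blocks allocated only for the source's data regions, subject to the file system's own allocation granularity.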

Next on my list are install and dd.

This revision was automatically updated to reflect the committed changes.

> One concern I had w.r.t. doing this was that, when large
> copy_file_range() copies are done, the time it takes for the
> copy to complete in the kernel can be significant (which
> delays signal handling among other things).
> During review of the syscall, kib@ did not think this was
> a serious issue, since it was similar to a core dump of
> a large application.

This reads like an early discussion of a security issue.

> However, with the buffer size limited to 2Mbytes, this
> should not be a problem.

Except D27937 removed this. If the kernel bug remains, anything FreeBSD 13 or later is vulnerable, and tools shipped with 14 will break 14.

>> One concern I had w.r.t. doing this was that, when large
>> copy_file_range() copies are done, the time it takes for the
>> copy to complete in the kernel can be significant (which
>> delays signal handling among other things).
>> During review of the syscall, kib@ did not think this was
>> a serious issue, since it was similar to a core dump of
>> a large application.
>
> This reads like an early discussion of a security issue.

Local DOSes are not considered security issues. And NFS mounts are sort of quasi-local from a security perspective, because the client must be trusted.

>> However, with the buffer size limited to 2Mbytes, this
>> should not be a problem.
>
> Except D27937 removed this. If the kernel bug remains, anything FreeBSD 13 or later is vulnerable, and tools shipped with 14 will break 14.

https://reviews.freebsd.org/D26620 fixed the interruptibility issue in the kernel, for all callers.