Page MenuHomeFreeBSD

cp: fix operation on non-bleeding edge
AbandonedPublic

Authored by ehem_freebsd_m5p.com on Feb 27 2021, 1:12 AM.
Tags
None
Referenced Files
Unknown Object (File)
Mar 11 2024, 5:04 PM
Unknown Object (File)
Dec 25 2023, 8:18 PM
Unknown Object (File)
Dec 23 2023, 12:11 AM
Unknown Object (File)
Dec 20 2023, 1:29 AM
Unknown Object (File)
Oct 16 2023, 10:07 PM
Unknown Object (File)
Aug 14 2023, 11:08 AM
Unknown Object (File)
Aug 10 2023, 6:28 PM
Unknown Object (File)
Jul 9 2023, 8:08 PM
Subscribers

Details

Reviewers
asomers
Summary

copy_file_range() is only available on *very* recent versions of
FreeBSD. As such we need to handle ENOSYS indicating the system call
is unavailable.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 37646
Build 34535: arc lint + arc unit

Event Timeline

Problem is D26395 only handles EINVAL which /dev/null may trigger. Yet if one is building on a slightly non-recent version ENOSYS can occur. Right now I believe it is impossible to build HEAD on anything other than HEAD or potentially 14. This feels like a semi-urgent issue.

Now to continue fighting arc...

asomers requested changes to this revision.Feb 27 2021, 1:21 AM

Why would you be running a version of cp from a newer world with an older kernel? Running newer binaries on older kernels has never been supported.

This revision now requires changes to proceed.Feb 27 2021, 1:21 AM

But that doesn't answer the question: why would you be running a newer cp on an older kernel?

If one runs say release/release.sh -c <config>.conf it first builds a chroot and runs the rebuilt executables inside. The complete lack of compatibility breaks all builds. I suspect bug #249260 may actually be attributable to this.

I guess I should add, consistent failure during two build attempts:

===> kerberos5/lib/libasn1 (obj,all,install)
cp: rfc2459_asn1-priv.hx: Function not implemented
--- rfc2459_asn1-priv.h ---
*** [rfc2459_asn1-priv.h] Error code 1

Current run isn't yet complete, but I suspect this time it may finally complete (then I can try my actual goal!). Making builds impossible seems rather problematic.

Current build hasn't yet completed, but it looks suspiciously like this was the last issue. Build tools are the exception where effort should be put into making them function where other utilities aren't bothered with. Particularly here where it is a tiny fix.

When looking at this I'm wondering whether the errno values should be a blacklist instead of a whitelist. Allowing EXDEV could be argued for. Failing to fall back in situations where read()/write() would work is a problem, trying read()/write() and having them fail is only harmful to performance.

Current build hasn't yet completed, but it looks suspiciously like this was the last issue. Build tools are the exception where effort should be put into making them function where other utilities aren't bothered with. Particularly here where it is a tiny fix.

I think this is a good fix. The whole build / bootstrap / recovery part of the system needs to be somewhat more resilient across the last couple of versions. Think to yourself "can the system survive well enough to rebuild itself if something went wrong during a make installworld" followed quickly by "if someone did an installworld installkernel then reboot and had to back out the new kernel, could they build a replacement with the old kernel and new userland?"

When the answer to these questions has been 'no' on the past (dating back 20 years to 4.x), we've put in 'forward compatibility' code to ensure people didn't have to find old binaries to dig out... Not for every utility, and for limited duration (a major release or two sometimes), but for ones you need to dig out with.

Not sure about white vs black list of error numbers. But I do wonder how ENOSYS is returned when it's not in the kernel. Normally new system calls generate SIGSYS which terminates the program. How is it that this one is different?

bin/cp/utils.c
248

Normally ENOSYS is SIGSYS and a terminated program. How is that avoided here?

Why would you be running a version of cp from a newer world with an older kernel? Running newer binaries on older kernels has never been supported.

No. This is incorrect. For critical tools like cp and other things needed to build the system, it has always been supported. There are quite a number of 'forward compatibility' hacks that we've put in over the years (and later removed) in a number of tools or libraries to allow exactly this sort of thing.

While it is true in general, it's not true of the bootstrap path because all too often people will need to fall back to an old kernel after doing an installworld (either by doing it before testing the new kernel, or by discovering some critical issue after the installworld finishes or sometimes because the installworld dies half way through). cp is one of the programs that, at least in the past, has been in this path....

In D28959#648411, @imp wrote:

Not sure about white vs black list of error numbers. But I do wonder how ENOSYS is returned when it's not in the kernel. Normally new system calls generate SIGSYS which terminates the program. How is it that this one is different?

That is a rather curious observation. According to all documentation I could find, the default for SIGSYS is indeed core dump. Perhaps the behavior of sending SIGSYS was changed or the behavior is different depending upon the type of unimplemented system call? SIGSYS seems appropriate in cases like syscall 19 which shouldn't ever occur (something is very wrong if this happens); whereas the more gentle ENOSYS is more appropriate for system calls >= SYS_MAXSYSCALL since they're likely to be handled in a fashion similar to what this patch proposes.

This patch covers the minimal case needed for current compatibility. I'm wondering if perhaps something like:

/* mention EINVAL, ENOSYS here */
if (rcount < 0)
        use_copy_file_range = 0;

would be better. Trying read()/write() if we get ENOMEM or ENOSPC will fail, but is it worth the extra code to check for the known good or bad values?

In D28959#648411, @imp wrote:

Not sure about white vs black list of error numbers. But I do wonder how ENOSYS is returned when it's not in the kernel. Normally new system calls generate SIGSYS which terminates the program. How is it that this one is different?

That is a rather curious observation. According to all documentation I could find, the default for SIGSYS is indeed core dump. Perhaps the behavior of sending SIGSYS was changed or the behavior is different depending upon the type of unimplemented system call? SIGSYS seems appropriate in cases like syscall 19 which shouldn't ever occur (something is very wrong if this happens); whereas the more gentle ENOSYS is more appropriate for system calls >= SYS_MAXSYSCALL since they're likely to be handled in a fashion similar to what this patch proposes.

This patch covers the minimal case needed for current compatibility. I'm wondering if perhaps something like:

/* mention EINVAL, ENOSYS here */
if (rcount < 0)
        use_copy_file_range = 0;

would be better. Trying read()/write() if we get ENOMEM or ENOSPC will fail, but is it worth the extra code to check for the known good or bad values?

We've used the kernel version in the past to know what to call. Maybe we can do the same thing here and default to 'no' for versions that are too old. that would also work around the problem. The other way is to have a sigsys handler for cp that just sets user_copy_file_range = 0 and returns, which would cause the ENOSYS to be returned and the fallback would just happen w/o changing the above code...

In D28959#648959, @imp wrote:

We've used the kernel version in the past to know what to call. Maybe we can do the same thing here and default to 'no' for versions that are too old. that would also work around the problem. The other way is to have a sigsys handler for cp that just sets user_copy_file_range = 0 and returns, which would cause the ENOSYS to be returned and the fallback would just happen w/o changing the above code...

My statement was that was curious since this is tested and confirmed to work on FreeBSD 12.2-p4. Did the behavior of sending SIGSYS get broken? Did someone make a deliberate choice not to send SIGSYS for calls >= SYS_MAXSYSCALL, since this is the most likely situation and not sending SIGSYS makes things simpler?

Doing nothing more gets us back to 12.2-p4. Questions are: How far back does this (mis?) behavior go? How far back is compatibility with userspace cp worthwhile?

Is the following acceptable to @asomers :

/* mention EINVAL, ENOSYS here */
if (rcount < 0)
        use_copy_file_range = 0;

I'm fine with using this instead.

bin/cp/utils.c
248

In case I haven't gotten overt enough, I ran into this trying to build HEAD on a 12.2-p4 system. This produced the message "Not Implemented" => ENOSYS which was attributable to the unavailable copy_file_range() system call. I've no idea why it returns ENOSYS instead of sending SIGSYS, merely that it does. As such the provided patch works.

We probably don't want to try the fallback path for cases like ENOSPC and EACCES. I think the original code, explicitly checking for ENOSYS and EINVAL, is better. BTW, I successfully built 13.0-BETA3 on 12.2-RELEASE yesterday. However, I did not use release.sh just buildworld and buildkernel. Is that your experience too, that release.sh is necessary to trigger the bug?

We probably don't want to try the fallback path for cases like ENOSPC and EACCES. I think the original code, explicitly checking for ENOSYS and EINVAL, is better. BTW, I successfully built 13.0-BETA3 on 12.2-RELEASE yesterday. However, I did not use release.sh just buildworld and buildkernel. Is that your experience too, that release.sh is necessary to trigger the bug?

Possibly, portions of release.sh run cp in a chroot which results in running a target build of cp on the build system kernel.

I agree the ideal is not to try read()/write() if copy_file_range() has given ENOSPC, EACCES, EFBIG, or EDQUOT. I'm asking whether it is actually worthwhile to try to whitelist or blacklist certain errno values. If EINVAL, ENOSYS, or other error values for which retrying is the correct approach are more common, then the above approach is actually better. Further if the original code had never tried to test errno, then there would never even temporarily have been a bug.

There's no way that EINVAL and ENOSYS are more common than errors like ENOSPC, EROFS, and EACCES. The latter three can happen in regular production environments, but the first two can only happen when building new releases like you found, or after an upgrade gone wrong.

Quite likely, but it needed consideration. So, where does this stand?

I think the current version is better than always using the fallback. Just fix the lint issue and I'll accept it.

bin/cp/utils.c
248

Out linter will complain about this unless you add a /* FALLTHROUGH */ above the second case statement.

Oh, that sort of situation. One other perversion^Widea came to mind:

switch (errno) {
/* Prob a non-seekable FD */
case EINVAL:
/* Earlier OS version */
	/* fallthrough */
case ENOSYS:
/* different devices */
	/* fallthrough */
case EXDEV:
	use_copy_file_range = 0;
	break;

case ENOSPC:
	/* fallthrough */
case EDQUOT:
	/* fallthrough */
case EFBIG:
	/* fallthrough */
case EROFS:
	/* fallthrough */
case EINTR:
	/* fallthrough */
case EBUSY:
	/* fallthrough */
case ENOBUFS:
	/* fallthrough */
case ENOMEM:
	/* fallthrough */
case EINTEGRITY:
	/* fallthrough */
case EIO:
	/* fallthrough */
case EBADF:
	/* fallthrough */
case EISDIR:
	/* fallthrough */
	break;

/* fallback for safety */
default:
	use_copy_file_range = 0;
}

That is likely going completely overboard in the wrong direction. The advantage is this is a bit more future-proof, potentially copy_file_range() could sprout more errors and this classifies all known ones and defaults to the safe strategy.

Cover the fallthrough comments. Ponder this approach since it fails
better in case of new return code from copy_file_range().

I like this better since it will try read()/write() in case of copy_file_range() sprouting new error return codes. Problem is that is a a lot of lines since the list of known terminating return codes is large. This may be a no win/pick your poison case.

ehem_freebsd_m5p.com marked an inline comment as done and an inline comment as not done.Mar 7 2021, 12:51 AM

Errf, selecting the wrong one to mark done. ENOSYS generating SIGSYS is still a mystery (doesn't cause failures, but this sort of mystery is concerning).

Presently I'm waiting for word on D28959. I dislike my proposal since it is a bunch of lines, but I dislike bluelisting the few valid errors. I strongly prefer a redlist of the ones for which copy_fallback() should not be tried.

Not quite 2 years since the last word on this. The whole value to me was making it possible to build CURRENT on a RELEASE host. FreeBSD-13.1 has been released, so the value has been lost.