
Add support for -c option to sha256sum et.al.
ClosedPublic

Authored by se on Jun 18 2021, 9:22 AM.

Details

Summary

The checksum programs have been extended to emulate GNU behavior when invoked under a name ending in "sum".
Some ports now fail (e.g. print/qpdf) because they run "sha256sum -c <chksumsfile>" and abort the configure phase if that command fails.
sha256sum exits with an error status because "-c" is not implemented in GNU emulation mode.

The attached patch adds support for checksum verification as performed by the GNU sha256sum program (and of course also for the other hash algorithms supported).

Test Plan

Apply patch and build sha256sum.
Verify that there are no regressions.
Test the new feature, e.g. by executing "make configure" for the print/qpdf port (which is known to fail without the patch).

Diff Detail

Repository
rG FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

se requested review of this revision. Jun 18 2021, 9:22 AM
se created this revision.

Note that the GNU coreutils sha256sum actually has support for reading BSD sha256(1) formatted lines, e.g.:

$ cat CHECKSUM.SHA256-FreeBSD-13.0-RELEASE-amd64
SHA256 (FreeBSD-13.0-RELEASE-amd64-bootonly.iso) = c81a911f9d5fc7404877dd679771d776e1447cc38b31e1c07042d2620e49d4ac
SHA256 (FreeBSD-13.0-RELEASE-amd64-bootonly.iso.xz) = f89fa42b3d93cf5c380b2726a63500b6106b54bd020ddc0c125b76b141e026f2
SHA256 (FreeBSD-13.0-RELEASE-amd64-disc1.iso) = f78d4e5f53605592863852b39fa31a12f15893dc48cacecd875f2da72fa67ae5
SHA256 (FreeBSD-13.0-RELEASE-amd64-disc1.iso.xz) = edf45ba6fad6a6aabc56623562a419096f4aaf78473ac8e96d2870cf27816195
SHA256 (FreeBSD-13.0-RELEASE-amd64-dvd1.iso) = d3df1818c0b90ae8d4c88c447dd158c3c3a3ddada4171ac7b0fe55baa040c821
SHA256 (FreeBSD-13.0-RELEASE-amd64-dvd1.iso.xz) = 036ab9d2a96140e953fe6bcb57546567965c8ba05ca92a7e3c3f9eb8e222bd74
SHA256 (FreeBSD-13.0-RELEASE-amd64-memstick.img) = 3a1b0ef1e2211f03980eb00fdeedeb3cd9ead03f1bfcd9f6a1eb335c3b994377
SHA256 (FreeBSD-13.0-RELEASE-amd64-memstick.img.xz) = 7589cbc83b737da6a73c48ff250525b3eaec99522af9b878895895333ae4bad0
SHA256 (FreeBSD-13.0-RELEASE-amd64-mini-memstick.img) = 107ba7f8b07f60e92fe75f86690d17ebdf9f5b5b55b68e22ca1a51e80f19349d
SHA256 (FreeBSD-13.0-RELEASE-amd64-mini-memstick.img.xz) = f5cc2a37b5061961fb741acf4c633f303565153601da3d7fefefb1f150b13726

$ sha256sum --version
sha256sum (GNU coreutils) 8.30
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Ulrich Drepper, Scott Miller, and David Madore.

$ sha256sum -c --ignore-missing CHECKSUM.SHA256-FreeBSD-13.0-RELEASE-amd64
FreeBSD-13.0-RELEASE-amd64-bootonly.iso.xz: OK

but maybe it's a bit overkill to also support this...

In D30812#692927, @dim wrote:

Note that the GNU coreutils sha256sum actually has support for reading BSD sha256(1) formatted lines, e.g.:

$ cat CHECKSUM.SHA256-FreeBSD-13.0-RELEASE-amd64
SHA256 (FreeBSD-13.0-RELEASE-amd64-bootonly.iso) = c81a911f9d5fc7404877dd679771d776e1447cc38b31e1c07042d2620e49d4ac
SHA256 (FreeBSD-13.0-RELEASE-amd64-bootonly.iso.xz) = f89fa42b3d93cf5c380b2726a63500b6106b54bd020ddc0c125b76b141e026f2
[...]
$ sha256sum -c --ignore-missing CHECKSUM.SHA256-FreeBSD-13.0-RELEASE-amd64
FreeBSD-13.0-RELEASE-amd64-bootonly.iso.xz: OK

but maybe it's a bit overkill to also support this...

It wouldn't be much effort to add parsing of this format.
But for full compatibility we'd need to add --ignore-missing, too, and that implies long option support ...

I'd like to get the version proposed in this review committed first, since that will fix currently broken ports.

If further compatibility extensions are found to be useful, I'd be willing to implement them in a second step.
I could also add test cases for the different hash algorithms and usage modes.

BTW: I do not expect more than one -c <CHKSUMFILE> option per invocation.
It is easy to extend this to multiple file lists by not initializing "head" in the gnu_check() function, but I do not know how the coreutils version behaves ...

se retitled this revision from "Add support for -c option to sha25sum et.al." to "Add support for -c option to sha256sum et.al.". Jun 18 2021, 1:59 PM

Also, could you use arc to upload this, or do a diff -U999999 so we can see more of the context?

sbin/md5/md5.c
172

Can you replace this with some #define, or at the very least do the math for why 256 is the right length for 1024 bit hash?

173

Use MAXPATHLEN + 1 then.

182

Does this work with filenames / paths with spaces in them?

Update according to comments received:

  • Fix handling of filenames with blanks
  • Use macros for constants
  • Full context is included in the uploaded patch.

While fixing the checksum file parsing to allow for blanks in names, I have also added support for BSD style files (e.g. distinfo files used in the ports system).
The output has been aligned with that of the coreutils version (the two compared identical in a number of tests with valid and invalid checksum files).
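
As an illustration of what parsing a BSD style line such as "SHA256 (file) = digest" involves, here is a minimal sketch. It is not the code from the patch: parse_bsd_line() is a hypothetical helper, it assumes the trailing newline has already been stripped, and it does not validate that the digest consists of hex digits.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the code from the patch): split a BSD-style
 * line "ALG (filename) = hexdigest" in place.
 * Returns 0 on success, -1 if the line is malformed.
 */
static int
parse_bsd_line(char *line, char **alg, char **name, char **digest)
{
	char *open, *close, *p;

	open = strstr(line, " (");
	if (open == NULL || open == line)
		return (-1);
	/* Use the last ") = " so that names containing it still parse. */
	close = NULL;
	for (p = open + 2; (p = strstr(p, ") = ")) != NULL; p++)
		close = p;
	/* Reject a missing terminator, an empty name, or an empty digest. */
	if (close == NULL || close == open + 2 || close[4] == '\0')
		return (-1);
	*open = '\0';
	*close = '\0';
	*alg = line;
	*name = open + 2;
	*digest = close + 4;
	return (0);
}
```

Because the file name is delimited by " (" and ") = " rather than by whitespace, names with blanks need no special handling in this format.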

Additional changes:

  • More cases of malformed patterns are detected.
  • The usage information has been updated to account for the *sum variants.

If deemed useful, I could create Kyua test cases (e.g. based on the digests embedded in the program, plus additional test cases including error cases).

I'm not convinced that the embedded digests are still useful in that case.
But they might be required for easy benchmarking of the different hash functions?

Small change of the parsing of coreutils style digest files:

The coreutils digests contain either "  " (2 blanks) or " *" (blank followed by asterisk, if created with the -b option).
Our -r option put only a single blank as separator between digest and file name.
This patch allows for 1 blank, 2 blanks, or blank plus asterisk as separator, and it makes us generate 2 blank separators to be more compatible with the coreutils.
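
The separator rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: parse_gnu_line() and its hexlen parameter (the number of hex digits expected for the algorithm) are hypothetical, and the trailing newline is assumed to have been stripped.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the code from the patch): split a
 * coreutils-style line "<hexdigest><sep><filename>" in place.
 * Accepted separators: " ", "  ", or " *".
 * Returns 0 on success, -1 if the line is malformed.
 */
static int
parse_gnu_line(char *line, size_t hexlen, char **digest, char **name)
{
	size_t i;

	/* Need the digest, one blank, and at least one name character. */
	if (strlen(line) < hexlen + 2)
		return (-1);
	for (i = 0; i < hexlen; i++)
		if (!isxdigit((unsigned char)line[i]))
			return (-1);
	if (line[hexlen] != ' ')
		return (-1);
	line[hexlen] = '\0';
	*digest = line;
	*name = line + hexlen + 1;
	/* Skip the optional second blank or the binary-mode asterisk. */
	if (**name == ' ' || **name == '*')
		(*name)++;
	if (**name == '\0')
		return (-1);
	return (0);
}
```

One consequence of accepting a single blank is an ambiguity: with this sketch, a name that itself starts with a blank or an asterisk after a single-blank separator loses its first character.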

A man-page update is included in this patch as well. I have already (somewhat prematurely) pushed a part of this change to add the -sum forms to the synopsis.
This update of the man-page describes the behavior of either case.
I have opted for separate sections for the two cases instead of "-c string | file" for clarity.

This looks OK to me, but I'd feel better if you waited a little to see if other reviewers pipe up.

This revision is now accepted and ready to land. Jun 19 2021, 3:05 PM

The previous version had lost a few features of the coreutils compatible program versions (ending in "sum").
These are restored in this updated patch.

This revision now requires review to proceed. Jun 19 2021, 3:48 PM
sbin/md5/md5.c
172

Can you replace this with some #define, or at the very least do the math for why 256 is the right length for 1024 bit hash?

There already was HEX_DIGIT_LENGTH and I do use it now.
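
For reference, the arithmetic the reviewer asked about: each hex digit encodes 4 bits, so a 1024 bit hash needs 1024 / 4 = 256 hex digits, plus one byte for the terminating NUL if it is stored as a C string. A small sketch of that calculation follows; the macro and function names are hypothetical and are not taken from md5.c.

```c
#include <stddef.h>

/* Hypothetical macros for illustration; the names are not from md5.c. */
#define MAX_DIGEST_BITS		1024
#define BITS_PER_HEX_DIGIT	4
#define MAX_HEX_DIGITS		(MAX_DIGEST_BITS / BITS_PER_HEX_DIGIT)

_Static_assert(MAX_HEX_DIGITS == 256, "1024 bits map to 256 hex digits");

/* Buffer size for a digest string including the terminating NUL. */
static size_t
hex_buffer_size(size_t digest_bits)
{
	return (digest_bits / BITS_PER_HEX_DIGIT + 1);
}
```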

173

Code changed but does not use MAXPATHLEN in the definition of the longest expected input line.

se marked 2 inline comments as done.

Add test cases to detect output format regressions.

While the built-in test function tests the algorithms, there are several options that change the output format of the generated digests.
These tests were used to verify that the changes made for better emulation of the coreutils versions (-c file functionality) did not change the output format.

Fix tests/Makefile - I had not noticed that some files were not being installed since they already existed in the destination directory from prior testing.

I want to commit this version if there are no further review comments within the next 24 hours.

This revision is now accepted and ready to land. Jun 24 2021, 4:13 PM