tests/ci: Add reproducibility tests
Needs Review, Public

Authored by bofh on Tue, Oct 14, 2:48 PM.

Details

Reviewers
emaste
lwhsu
markj
Summary

This patch adds reproducibility tests to our pre-commit scripts. Two builds are performed, and they differ as follows:

  • The primary src path differs
  • LANG differs: default vs. et_EE.UTF-8
  • LC_ALL differs: default vs. et_EE.UTF-8
  • TZ differs: default vs. /usr/share/zoneinfo/Etc/GMT-14
  • The system date differs: now vs. 1 year, 1 month, 1 day, 6 hours and 23 minutes ahead
  • The first build is done with -j1, the second with -j${parallelism}, i.e., all available CPUs

Finally, diffoscope generates a report comparing the two OBJDIRPREFIX trees.
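
The variations listed above could be expressed roughly as follows. This is a minimal sketch, not the script from this revision: the function name variation_env, the first/second labels, and the assumption that the first build keeps the defaults while the second one takes the mutated values are all illustrative. The closing comment shows a diffoscope call whose two object directory paths are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print the environment overrides for one of the two builds.
# Assumption: the first build keeps defaults, the second gets the mutated values.
variation_env() {
    build=$1
    ncpu=${2:-1}    # parallelism for the second build; the caller supplies the CPU count
    case "$build" in
    first)
        echo "JFLAG=-j1"
        ;;
    second)
        echo "LANG=et_EE.UTF-8"
        echo "LC_ALL=et_EE.UTF-8"
        echo "TZ=/usr/share/zoneinfo/Etc/GMT-14"
        echo "JFLAG=-j${ncpu}"
        ;;
    *)
        echo "unknown build: $build" >&2
        return 1
        ;;
    esac
}

# Example: show what the second build would run with on an 8-CPU machine.
variation_env second 8

# After both builds finish, the two object trees (placeholder paths) could be
# compared with something like:
#   diffoscope --html report.html /path/to/obj-first /path/to/obj-second
```

Keeping the overrides in one place makes it easy to add further variations later without touching the build invocation itself.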

Test Plan
# cd /usr/src/tests/ci
# make ci CITYPE=reproducibility

After the run, meta should contain an HTML file.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

bofh requested review of this revision. Tue, Oct 14, 2:48 PM
bofh edited the summary of this revision.

My understanding is that having those in tests/ci/tools/freebsdci means running in the testvm, right? If so, that is the test environment, i.e., the VM provisioned for running the full or smoke tests. I feel that's not necessary; this can be done in the build environment, i.e., the env where you run the buildworld and buildkernel commands and build the VM images. That is usually more efficient, and we don't really need to do a full build in an env that is itself under test. We just need to build things (for test) in a verified env.

No. This is not possible. I have checked and exhausted all possible ways to do that, unless you have an entirely better idea. Everything can be tested except the date change. Changing the date on the host itself might have a catastrophic effect on our build servers and build environment, and I am not willing to do that. I have come to this conclusion after lots of tests on Jenkins and with scripts. I am open to ideas, but not to the one you have recommended. Do you remember our discussion about why SOURCE_DATE_EPOCH cannot really help us test reproducibility on two different dates?

As listed here: https://tests.reproducible-builds.org/freebsd/freebsd.html, it seems that there are a lot of other parameters that can also be changed to get better coverage. For example, can we also change these parameters?

  • (trivial) UID/GID of the building user
  • (trivial) hostname
  • (maybe in the future) mounting a disorderfs path and building there?

> As listed here: https://tests.reproducible-builds.org/freebsd/freebsd.html, it seems that there are a lot of other parameters that can also be changed to get better coverage. For example, can we also change these parameters?
>
>   • (trivial) UID/GID of the building user

I think that since NO_ROOT is going to be the default, this no longer has any effect on testing. I have already discussed this with @emaste.

>   • (trivial) hostname

I believe hostname matters only for ports builds. But yes, that can be added later, or here too.

>   • (maybe in the future) mounting a disorderfs path and building there?

As we are building inside a VM, we have a plethora of options to explore. But I want to land this first.

> I think that since NO_ROOT is going to be the default, this no longer has any effect on testing. I have already discussed this with @emaste.

We should still verify, though: root vs. non-root should not have an effect, but it's possible that "built by <user>"-type strings get included.

> I believe hostname matters only for ports builds. But yes, that can be added later, or here too.

Same reason as above.

> As we are building inside a VM, we have a plethora of options to explore. But I want to land this first.

Yes, we can start with something and extend it over time.
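
One cheap way to check for "built by <user>"-type strings or an embedded hostname is to search the build output for those values directly. The following is only a sketch under assumptions: scan_for_leaks is a hypothetical helper, not part of this revision, and the paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory that contain any of the
# given fixed strings (for example a user name or a hostname).
scan_for_leaks() {
    dir=$1
    shift
    for needle in "$@"; do
        # -r: recurse, -l: print only file names, -F: fixed-string match
        grep -rlF -- "$needle" "$dir"
    done | sort -u
}

# Example with placeholder values; a real run would pass the object directory
# plus the build user and hostname used for that build:
#   scan_for_leaks /path/to/obj "$(id -un)" "$(hostname)"
```

An empty result does not prove reproducibility, since values can be embedded in encoded or compressed form, so this complements the diffoscope comparison rather than replacing it.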

Can you regenerate the patches with git show -U999999 or git diff -U999999 to include context?

  • Add random hostname for second build
  • Use tests user as the second build user

Oh, I should have explained this more clearly. First, I don't object to landing this first, as it is a good start. I thought this review could be used to discuss it further.

The test env here means our testvm, which is a snapshot build result that has not been verified yet, so it may not be a very good base. In an ideal world we would need to build head in a release, and head in head, to be complete. That is because when an unreproducible issue is found, it can be in the build env or in the code or, in the worst case, the updated build env and code together mask the error, which means a false negative. Another possible way is to have some simple reproducibility test cases in the test suite (/usr/tests) to verify the build env first; then we can be sure that the env can be used to verify that the source (/usr/src) is reproducible.

Anyway, this could be too complex, and I don't think we have to do it now. What I was thinking of is adding a few targets, e.g. (with very bad naming here) build-j1, build-mutated, check-reproducible, etc., which people can run in their own build env as needed. We can print a warning or check for a special definition to ensure that people know their env will be modified. That way people can run those targets in a disposable env (e.g., their own VM) without being forced to use our specified VM.

In our CI we can then select the preferred build env: it can be a testvm we have built and verified previously, or just a -RELEASE VM (with cloud-init, or nuageinit if it supports running arbitrary commands). We can copy or mount the src into the VM in various ways, or even put it on another disk and attach that to the VM as a second disk. (Speaking of which, does our testvm have enough space to host two sets of obj files?) Then we run the needed targets in that VM. We could even run the normal build outside, run only the mutated build in a VM, and verify the result outside the VM again; that just needs more copying of artifacts to and from the VM for verification. Again, this is only for discussion for now and does not block the current status.
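
The "check a special definition" idea could be sketched as an opt-in guard in front of the mutating target. Everything here is hypothetical: the variable REPRO_ALLOW_ENV_CHANGES and the helper names are made up for illustration, and the echo lines stand in for the real build steps.

```shell
#!/bin/sh
# Hypothetical opt-in guard: the variable name is made up for this sketch.
require_disposable_env() {
    if [ "${REPRO_ALLOW_ENV_CHANGES:-no}" != "yes" ]; then
        echo "refusing to run: set REPRO_ALLOW_ENV_CHANGES=yes in a disposable environment" >&2
        return 1
    fi
}

# Dispatch the three illustrative targets named in the comment above.
run_target() {
    case "$1" in
    build-j1)
        echo "would run the baseline build with -j1"
        ;;
    build-mutated)
        # Only this target changes the environment, so only it needs the guard.
        require_disposable_env || return 1
        echo "would run the mutated build"
        ;;
    check-reproducible)
        echo "would compare the two object trees"
        ;;
    *)
        echo "unknown target: $1" >&2
        return 1
        ;;
    esac
}
```

With this shape, the baseline build and the comparison stay safe to run anywhere, and only the mutated build demands an explicit acknowledgement.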

BTW, IIRC what we discussed is that devel/libfaketime is broken and cannot be used for this test (we also haven't checked whether it could be used even if it worked). And yeah, SOURCE_DATE_EPOCH is not for testing; it is a fix for the unreproducible stuff. Well, if jails supported their own time namespace, that would perhaps be a good way to do this kind of test.
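
To illustrate the SOURCE_DATE_EPOCH point: a tool that honors the variable produces the same timestamp on any date, while a tool that reads the clock directly does not, and only a build under a genuinely different clock exposes the latter. This is a toy sketch; both functions are hypothetical stand-ins for build tools, and the epoch value is arbitrary.

```shell
#!/bin/sh
# A cooperating tool: uses SOURCE_DATE_EPOCH when set, else the current clock.
stamp_cooperating() {
    echo "${SOURCE_DATE_EPOCH:-$(date +%s)}"
}

# A non-cooperating tool: always reads the clock and ignores SOURCE_DATE_EPOCH.
stamp_clock_only() {
    date +%s
}

SOURCE_DATE_EPOCH=1700000000
export SOURCE_DATE_EPOCH
stamp_cooperating   # prints 1700000000 regardless of the system clock
stamp_clock_only    # depends on when it runs; only a changed clock exposes this
```

Setting SOURCE_DATE_EPOCH in both builds would therefore hide nothing about the first kind of tool and reveal nothing about the second, which is why the test varies the actual date instead.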