Page MenuHomeFreeBSD

Allow reproducible kernel builds by removing build metadata
ClosedPublic

Authored by emaste on Dec 2 2015, 3:26 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Dec 20, 12:01 AM
Unknown Object (File)
Thu, Dec 19, 11:38 PM
Unknown Object (File)
Sun, Dec 8, 6:22 AM
Unknown Object (File)
Sat, Dec 7, 6:01 AM
Unknown Object (File)
Wed, Dec 4, 8:57 AM
Unknown Object (File)
Tue, Dec 3, 10:28 PM
Unknown Object (File)
Mon, Dec 2, 9:06 PM
Unknown Object (File)
Fri, Nov 29, 7:22 PM
Subscribers

Details

Summary

Set WITH_KERNEL_METADATA=yes in src.conf to include the time, build user and host, and path in the kern.version sysctl and uname.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

emaste retitled this revision from to newvers: Add reproducible build command line option.
emaste updated this object.
emaste edited the test plan for this revision. (Show Details)
emaste added reviewers: bapt, imp.
emaste retitled this revision from newvers: Add reproducible build command line option to Default to reproducible kernel builds by removing build metadata.
emaste updated this object.

Set the default to exclude the build metadata and add a src.conf(5) option to enable.

UPDATING entry TBD.

kib added a reviewer: kib.
This revision is now accepted and ready to land.Dec 2 2015, 4:23 PM

Technically it seems fine. I don't like the style of 'if [ $INCLUDE_METADATA ]', but it works so it is fine.

I only have concerns about the change itself. Not so much an opinion on right or wrong, but if we're going to default to removing the user@host, maybe we should just remove it entirely rather than add more options/complexity.

You should check that this doesn't break crashinfo's find_kernel() function. Removing the path _will_ break kgdb -n X which people actively use. (It pulls the compile path out of the info.X file from the dump and uses that to find the build directory and corresponding kernel build.) For release builds the user should always be 'root', and the path should always be '/usr/obj/usr/src/sys/GENERIC', so the only one that is actually variable is the hostname. The kgdb / crashinfo bits only need the path (though they probably have to be updated to handle the different format if you were to drop the user and host and leave the path).

Don't like the default being no.
In both development and deployment environments, knowing when the kernel was built is extremely useful to know if you are chasing down that bug you fixed last night around 5. You can't get that information easily knowing it is build 44. Was that the build before it or after? In many different scenarios, ranging from my own stupid luck on a single server, to multi-system QA environments, the actual build time has proven quite useful.

I understand that it has little meaning when you are looking at running a stock kernel. But that's a very controlled environment where you can set things appropriately.

Also, why not have a more generic MK_REPEATABLE_BUILD that would set all the sub-options needed to get a repeatable build to yes / no as needed?

sys/conf/kern.opts.mk
49 ↗(On Diff #10671)

Why default no? Shouldn't this be default yes?

sys/conf/newvers.sh
249

Elsewhere in the tree this is spelled either

if [ ! -z $INCLUDE_METADATA ]; then

or

if $INCLUDE_METADATA; then # with INCLUDE_METADATA set to true or false

sys/conf/kern.opts.mk
49 ↗(On Diff #10671)

My hope is that we default to a reproducible build.

sys/conf/newvers.sh
249

! -z ought to be -n (and the variable needs to be quoted)

sys/conf/kern.opts.mk
49 ↗(On Diff #10671)

I don't think that's a reasonable default for the kernel. It only makes sense to have it default to the reproducible build for pre-packaged system. Eliminating the metadata (including the build number) would severely impact the debuggability of the system. When you've modified the tree, the metadata becomes critical to retain. As such, I think it only makes sense to have it default for release builds. I've relied on this data a lot in the past, and don't think it is reasonable to make it 'opt in' because by the time you want to ask the question about the kernel, it's too late to go back and rebuild to get it. Once the 'M' modifier is appended to the svn version, you need this extra data. Without the M modifier, you're right, it doesn't matter: you can get the data from what's presented in this patch.

I'd suggest -M which is 'reproducible if $SCM indicates pure tree, otherwise include what's included today'. When the tree is pure, the who / where built is of only very limited use. You can get all the traceability info you need from the $SCM version / hash. When the tree is modified, however, you lose all that information that's external to the tree. You can't tell kernels apart easily and that's often critically important.

If we don't want to have a lot of knobs, just implement this.

sys/conf/newvers.sh
68

You could have m and M. M would be 'reproducible unless the tree is modified.'.

183

If svn ended with M you could set a modified flag here.

249

You could test here for -m (unconditional reproducible) or -M (reproducible if not modified). That would address my concern about traceability of kernels.

Or to make it simpler you could do what I'm describing as -M always. That would be a reasonable default.

emaste retitled this revision from Default to reproducible kernel builds by removing build metadata to Allow reproducible kernel builds by removing build metadata.
emaste edited edge metadata.
  • exclude build knobs for now (just add support in newvers.sh)
  • add mode to choose to include metadata or not based on repo modified/dirty status
  • use -r to enable reproducible builds, as NetBSD did
This revision now requires review to proceed.Jun 2 2016, 5:38 PM
sys/conf/newvers.sh
237

I'm not sure how to determine if a hg repo is clean, but this can be added later

I think crashinfo will DTRT, but it would be good to test that explicitly if you haven't already. It would be nice if 'kgdb -n last' also works still, but requesting users to explicitly specify the path to the kernel when requesting followup info for a bug report isn't the end of the world if it doesn't work.

In D4347#141160, @jhb wrote:

I think crashinfo will DTRT, but it would be good to test that explicitly if you haven't already. It would be nice if 'kgdb -n last' also works still, but requesting users to explicitly specify the path to the kernel when requesting followup info for a bug report isn't the end of the world if it doesn't work.

I still hope to make crashinfo fully reliable by introducing build-id later on, but will investigate the situation based on the current path further.

This revision was automatically updated to reflect the committed changes.