While building libm, turn off any math-related compiler builtins
Closed · Public

Authored by dim on Feb 10 2021, 7:27 PM.

Details

Reviewers

emaste
arichardson
alfredo
se
jrtc27
ngie

Commits

rG7ee9798bad0b: Build lib/msun tests with compiler builtins disabled
rGe91ff79b2811: Build lib/msun tests with compiler builtins disabled
rG27e68baf24f1: Build lib/msun tests with compiler builtins disabled
rGcf97d2a1dab8: Build lib/msun tests with compiler builtins disabled

Summary

This is an idea I got when debugging libm test cases, and finding that
the compiler had neatly inlined pow(), sin() and various other calls.
Obviously for general code this behavior is just fine, but not when
either building libm itself, or its test cases!
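
To illustrate the effect described above, here is a minimal sketch of how a test can end up never calling the library. ldexp() is used purely as a stand-in for pow() or sin(), and the function names are hypothetical. The volatile function pointer is one source-level way to force a real call; it is not the approach proposed in this review, which relies on compiler flags instead.

```c
#include <assert.h>
#include <math.h>

/*
 * Direct call with constant arguments: with builtins enabled, the compiler
 * is allowed to replace this with the constant 8.0, so the library's
 * ldexp() may never run.
 */
double direct_call(void)
{
    return ldexp(1.0, 3);
}

/*
 * The pointer itself is volatile, so it must be loaded at run time and the
 * call is indirect; the compiler cannot assume it still points at ldexp().
 */
static double (*volatile ldexp_ptr)(double, int) = ldexp;

double indirect_call(void)
{
    return ldexp_ptr(1.0, 3);
}

/* Both paths must agree on 1.0 * 2^3; call this from main() or a test case. */
void check_ldexp_paths(void)
{
    assert(direct_call() == 8.0);
    assert(indirect_call() == 8.0);
}
```

Both functions return the same value, but only indirect_call() is guaranteed to exercise the library's own code, which is the property a libm test case needs.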

Of course the easy way out is to use -fno-builtin, turning off the
compiler's builtins completely. But on the other hand, it would be
perfectly fine to have it e.g. inline or optimize non-math related calls
like memcpy(), whether while building libm itself or its tests.

Therefore I made a list of math functions that are announced in math.h
and complex.h, and added a separate math-builtins.mk which disables each
of those individually, using -fno-builtin-xxx.

NOTE: neither clang nor gcc complain about builtins they don't know, so there is no need for cumbersome trimming of the list and adjusting to compiler types or versions. We can simply list all the functions implemented in libm.

Test Plan

I ran this through the complete "make check" with atf and kyua, and I
got no errors.

Diff Detail

Repository

rG FreeBSD src repository

Event Timeline

dim created this revision. Feb 10 2021, 7:27 PM

Herald added a subscriber: imp. Feb 10 2021, 7:27 PM

dim requested review of this revision. Feb 10 2021, 7:27 PM

Harbormaster completed remote builds in B36872: Diff 83659. Feb 10 2021, 7:27 PM

Oops, fix comments.

Harbormaster completed remote builds in B36873: Diff 83660. Feb 10 2021, 7:29 PM

I wonder if we should just use -fno-builtin (or even -ffreestanding) and explicitly use __builtin_memcpy etc?

Doing so for tests makes sense, but why libm itself?

assuming we go with this and not general -fno-builtin can we come up with some way to make sure they stay up-to-date?

In D28577#639774, @arichardson wrote:

I wonder if we should just use -fno-builtin (or even -ffreestanding) and explicitly use __builtin_memcpy etc?

I originally wanted to avoid having to go over all the sources and hunt down all such calls, but in fact there shouldn't really be many. This library tends to be very self-contained.
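
As a rough sketch of what "explicitly use __builtin_memcpy" could look like in one such call site: the helper below extracts the upper 32 bits of a double. The name high_word is hypothetical, and the code assumes a 64-bit IEEE-754 double. GCC and Clang keep the __builtin_-prefixed forms available under -fno-builtin, so this copy can still be expanded inline while a plain memcpy() would become an ordinary library call.

```c
#include <assert.h>
#include <stdint.h>

/* The bit copy below only makes sense if both types have the same size. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");

/* Return the upper 32 bits of the representation of x. */
uint32_t high_word(double x)
{
    uint64_t bits;

    /* Explicit builtin: not affected by -fno-builtin. */
    __builtin_memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits >> 32);
}
```

For example, high_word(1.0) yields 0x3ff00000 on an IEEE-754 platform, since 1.0 is encoded as 0x3ff0000000000000.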

In D28577#639776, @jrtc27 wrote:

Doing so for tests makes sense, but why libm itself?

This is because libm has many functions which are implemented in terms of other math functions. For instance, many functions which handle complex arguments are implemented in terms of calling the non-complex variants. E.g. csinh() calls sinh(), cosh(), sin() and cos(). In all these cases, unexpected outcomes could occur if the compiler decides to shortcut the logic by inserting its own builtins at those call sites.
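
For reference, the csinh() example corresponds to the standard identity for the hyperbolic sine of a complex argument, written here for z = x + iy. This is the textbook formula only; an actual implementation presumably also needs special handling for infinities, NaNs and overflow.

```latex
\sinh(x + iy) = \sinh(x)\cos(y) + i\,\cosh(x)\sin(y)
```

Each of the four real-valued calls on the right-hand side is a call site where the compiler could substitute a builtin for the library's own routine.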

In D28577#639798, @emaste wrote:

assuming we go with this and not general -fno-builtin can we come up with some way to make sure they stay up-to-date?

Yeah, this is a bit of a burden. Now that I'm thinking about this again, it might be better to simply use the generic -fno-builtin and if needed, identify any non-math related libcalls that *could* be inlined without causing trouble.

(Btw, note that this same reasoning also holds for e.g. libc, which we also don't compile with -fno-builtin.)

In D28577#639776, @jrtc27 wrote:

Doing so for tests makes sense, but why libm itself?

This is because libm has many functions which are implemented in terms of other math functions. For instance, many functions which handle complex arguments are implemented in terms of calling the non-complex variants. E.g. csinh() calls sinh(), cosh(), sin() and cos(). In all these cases, unexpected outcomes could occur if the compiler decides to shortcut the logic by inserting its own builtins at those call sites.

But that's legal, and if the function has different semantics than C says it should then you shouldn't be calling it that (and how is it ok to use that in the system's standard library implementation)? This feels to me like fixing the wrong problem.

(Btw, note that this same reasoning also holds for e.g. libc, which we also don't compile with -fno-builtin.)

Yes and I don't want perfect to be the enemy of good, it is just something for us to consider.

In D28577#639808, @jrtc27 wrote:

But that's legal, and if the function has different semantics than C says it should then you shouldn't be calling it that (and how is it ok to use that in the system's standard library implementation)? This feels to me like fixing the wrong problem.

Yes, you are certainly right from a "legal" POV, but the way libm/msun works is very dependent on the actual implementation. We've seen some instances where at least clang decided to take some shortcuts, and this gave rise to failed test cases. The test cases themselves also suffered from this, as some of the cases were never really run at run time, as the compiler had simply inlined all the results. :)

Another type of solution would be to prefix all msun functions with a unique prefix, say msun_, and only call msun_sin, msun_cos, etc, to get the expected behavior. Then for the external users, define weak symbols to match the standard C entrypoints like sin, cos, etc.
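
A minimal sketch of that prefixing idea, assuming GCC or Clang on an ELF target, where the weak and alias attributes are available. All names are hypothetical: msun_fabs has a deliberately simplified body, and the public symbol is called demo_fabs rather than fabs so that the example does not clash with the system library.

```c
#include <assert.h>

/* Implementation under a library-private name (the msun_ prefix is hypothetical). */
double msun_fabs(double x)
{
    /* Deliberately simplified body; it does not handle -0.0 or NaN signs. */
    return (x < 0.0) ? -x : x;
}

/* Internal caller: refers to msun_fabs, a name the compiler has no builtin for. */
double msun_absdiff(double a, double b)
{
    return msun_fabs(a - b);
}

/*
 * Public entry point as a weak alias of the internal symbol. A real libm
 * would export "fabs"; demo_fabs is used here to avoid clashing with the
 * system library.
 */
double demo_fabs(double x) __attribute__((weak, alias("msun_fabs")));

/* Small self-check; call it from main() or a test case. */
void check_alias(void)
{
    assert(demo_fabs(-2.5) == 2.5);
    assert(msun_absdiff(1.0, 4.0) == 3.0);
}
```

Because internal code only ever names msun_fabs, the compiler has no standard function name to recognize and replace, while external users still reach the same code through the exported alias.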

In D28577#639808, @jrtc27 wrote:

But that's legal, and if the function has different semantics than C says it should then you shouldn't be calling it that (and how is it ok to use that in the system's standard library implementation)? This feels to me like fixing the wrong problem.

I have seen some LLVM tests fail because the constant folding of math calls produces different floating point constants when the buildbot runs on different operating systems. This is unlikely to happen here, but if it does and you are compiling with a buggy host libm, you might end up creating a broken target libm.

In D28577#639842, @arichardson wrote:

I have seen some LLVM tests fail because the constant folding of math calls produces different floating point constants when the buildbot runs on different operating systems. This is unlikely to happen here, but if it does and you are compiling with a buggy host libm, you might end up creating a broken target libm.

Not only that, but msun depends for quite a few features on FPU flags being set correctly, and it certainly depends on its own internal implementation details. Unfortunately the base of libm stems from the beginning of the 90s when there was only gcc, and it almost didn't optimize, certainly not calculating math stuff at compile time. Also, it is full of undefined behavior such as writing one union member and then reading another, and pointer aliasing all over the place. This is all very hard to fix since the code is extremely dense and undocumented, full of magic numbers.

In any case, I think I will just change this review to at least build the tests simply with -fno-builtin so that at least the libm.so functions are really tested. We can see about the libm guts itself later (or more probably never... :) ).