_end is a special symbol emitted by the static linker. It is supposed
to follow the .bss section in memory and is used by libc to set the
initial break (curbrk and minbrk). GNU ld and lld have differing
behaviour around this symbol, leading to incompatibilities when libc.so
and an executable using brk()/sbrk() are linked using different linkers.
For instance, libc.so may end up using its own internal definition of
_end instead of that of the executable. Some details are described in
PR 228574.

To avoid these incompatibilities, rewrite brk() and sbrk() so that they
do not use _end. To accomplish this, we add a return value to the kernel's
break() system call and change its error handling slightly. Previously,
break() would return EINVAL if the new break address was smaller than
the address of the beginning of the data segment. With this change,
such an input has no effect. The syscall returns its notion of the
break address (always page-aligned), so break(0) can be used to query
the kernel for the current break address. I believe this change is
backwards-compatible. Further, note that brk() implementations
silently clamp the input break address to minbrk, so brk(0) behaves the
same now as it did before. The syscall itself is undocumented and not
exported by libc.
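
As a rough illustration of the syscall semantics described above, the
following C sketch models the new behaviour in userspace. It is not the
kernel code: the sketch_* names, the 4096-byte page size and the
round-up direction are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((uintptr_t)4096)	/* assumed page size */

/* Hypothetical stand-in for the kernel's per-process break state. */
struct sketch_vm {
	uintptr_t data_start;	/* beginning of the data segment */
	uintptr_t cur_break;	/* current break, kept page-aligned */
};

/* Round an address up to a page boundary (assumed rounding direction). */
static uintptr_t
sketch_round_page(uintptr_t addr)
{
	return ((addr + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1));
}

/*
 * Model of the new break() semantics: a request below the start of the
 * data segment is ignored instead of failing with EINVAL, and the
 * resulting break address is returned in every case.
 */
static uintptr_t
sketch_break(struct sketch_vm *vm, uintptr_t addr)
{
	if (addr >= vm->data_start)
		vm->cur_break = sketch_round_page(addr);
	/* Invariant: the reported break is always page-aligned. */
	assert((vm->cur_break & (SKETCH_PAGE_SIZE - 1)) == 0);
	return (vm->cur_break);
}
```

In this model, sketch_break(&vm, 0) leaves the state untouched and
simply reports the current break, which is how break(0) can serve as a
query.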

Before this change, the brk() and sbrk() libc functions were implemented
separately for each supported platform. This change consolidates the
implementations with a rewrite in C and adds a few simple tests. Both
brk() and sbrk() take care to initialize curbrk and minbrk upon first
use, incurring a penalty of one extra system call per process which uses
these interfaces. Given that brk() and sbrk() are rarely used today, I
think this is acceptable.
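
The lazy initialization described above might look roughly like the
sketch below. It is an illustration only, not the committed code: the
kernel is replaced by a simulated sim_break(), and the sketch_* names,
the addresses and the errno values chosen for sbrk() failures are
assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simulated kernel state; every value here is an assumption. */
static uintptr_t sim_data_start = 0x10000;
static uintptr_t sim_kernel_break = 0x20000;
static unsigned int sim_syscalls;	/* number of simulated break() calls */

/* Hypothetical stand-in for break(): returns the page-aligned break. */
static uintptr_t
sim_break(uintptr_t addr)
{
	sim_syscalls++;
	if (addr >= sim_data_start)
		sim_kernel_break = (addr + 0xfff) & ~(uintptr_t)0xfff;
	return (sim_kernel_break);
}

static uintptr_t curbrk, minbrk;
static int brk_initialized;

/* On first use, ask the kernel for the current break (one extra call). */
static void
sketch_initbrk(void)
{
	if (!brk_initialized) {
		curbrk = minbrk = sim_break(0);
		brk_initialized = 1;
	}
}

/* Set the break, silently clamping requests below minbrk. */
static int
sketch_brk(uintptr_t addr)
{
	sketch_initbrk();
	if (addr < minbrk)
		addr = minbrk;
	(void)sim_break(addr);
	curbrk = addr;
	return (0);
}

/* Adjust the break by incr; returns the old break or (uintptr_t)-1. */
static uintptr_t
sketch_sbrk(intptr_t incr)
{
	uintptr_t newbrk, oldbrk;

	sketch_initbrk();
	assert(curbrk >= minbrk);
	oldbrk = curbrk;
	if (incr == 0)
		return (oldbrk);
	newbrk = oldbrk + (uintptr_t)incr;	/* unsigned arithmetic wraps */
	if (incr < 0 && (newbrk > oldbrk || newbrk < minbrk)) {
		errno = EINVAL;		/* assumed error for shrinking too far */
		return ((uintptr_t)-1);
	}
	if (incr > 0 && newbrk < oldbrk) {
		errno = ENOMEM;		/* assumed error on address overflow */
		return ((uintptr_t)-1);
	}
	(void)sim_break(newbrk);
	curbrk = newbrk;
	return (oldbrk);
}
```

The sim_syscalls counter makes the cost visible: the first call to
either function issues one additional simulated break(0), and later
calls do not repeat it.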

This change also adds sbrk.o to NOASM on all architectures, since the
sbrk() system call is, and always has been, a no-op.