The above commit fixed handling overaligned TLS segments in libc's TLS Variant
I implementation, but rtld provides its own implementation for
dynamically-linked executables which lacks these fixes. Thus, port these
changes to rtld.
Details
Compile and run https://reviews.freebsd.org/P198 as a dynamic executable,
checking addresses are 4K-aligned and the values are correct. I have only
tested this on mips (specifically mips64); aarch64, arm, powerpc, powerpc64 and
riscv are untested, as well as mips32.
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
libexec/rtld-elf/rtld.c | ||
---|---|---|
4707 ↗ | (On Diff #46025) | Perhaps add another MD macro for this expression. It could be also used in allocate_tls() and free_tls(). |
I tested this on RISC-V. It boots multiuser fine.
P198 test application gives this results:
root@qemu:~ # ./test1
&aligned_var_ie: 0x40048000
aligned_var_ie: 0x420043
&aligned_var_le: 0x40049000
aligned_var_le: 0x470048
Thanks, anything obviously wrong for 64-bit? Could you please print the value of register 13? Something like the following should work:
long r13; __asm__ ("addi %0,13,0" : "=r"(r13)); printf("r13: 0x%lx\n", r13); // as before ...
Fixed powerpc64's rtld_machdep.h (missed in the previous version), and introduced new calculate_tls_post_size MD macro.
Never mind, I'm an idiot, I just completely missed that there was a powerpc64 directory, rather than sharing an implementation with powerpc. Could you please check it's now fixed?
@jrtc27_jrtc27.com latest version of the patch on -m64:
&aligned_var_ie: 0x810058000 aligned_var_ie: 0x420043 &aligned_var_le: 0x810059000 aligned_var_le: 0x470048
ARMv7:
root@poudriere-image:~ # ./a.out &aligned_var_ie: 0x20046000 aligned_var_ie: 0x420043 &aligned_var_le: 0x20047000 aligned_varle: 0x470048
Firstly, I want to say that this patch is OK. It only slightly changed memory layout for TLS segment (because it uses malloc_aligned()) which exposed unrelated bug in ARM TLS relocation.
Originally, I found that this patch breaks bash on armv7. It started to take that 'a' is non-alphanumeric character, 'x' is white space. Simply, all ctype related function (isspace()...) was broken because _ThreadRuneLocale thread local variable was not been cleared on start, but was point to invalid (but mapped) memory location.
And after 3 days of digging on this I think that I known why.
The R_ARM_TLS_TPOFF32 relocation is on ARM defined as "S + A – tp".
But, on https://svnweb.freebsd.org/base/head/libexec/rtld-elf/arm/reloc.c?revision=328834&view=markup#l328
we do "tmp = (Elf_Addr)def->st_value + defobj->tlsoffset + TLS_TCB_SIZE"
(note the /* XXX: FIXME */ above this line:) )
The "+ TLS_TCB_SIZE" part is clearly superfluous, this offset is already contained in defobj->tlsoffset, and we always misrelocate it by 8.
The only consumer (for all binaries in /usr/bin and /usr/local/bin on my ARM board) of R_ARM_TLS_TPOFF32 is _ThreadRuneLocale variable. And accidentally misrelocated _ThreadRuneLocale was zero in all cases (before to this patch). For patched version (due to memory layut change) and only for complex cases (like bash is) it is clashed with another TLS variable/memory.
This patch fixes all my troubles:
Index: libexec/rtld-elf/arm/reloc.c =================================================================== --- libexec/rtld-elf/arm/reloc.c (revision 338066) +++ libexec/rtld-elf/arm/reloc.c (working copy) @@ -325,8 +325,7 @@ return -1; /* XXX: FIXME */ - tmp = (Elf_Addr)def->st_value + defobj->tlsoffset + - TLS_TCB_SIZE; + tmp = (Elf_Addr)def->st_value + defobj->tlsoffset; if (__predict_true(RELOC_ALIGNED_P(where))) *where = tmp; else ===================================================================
All whats I want now is another pair of eye(s) to verify if my observation is right and some time (1, max 2 days) to finish buildworld and some ports build.
If all gets OK I will commit my patch ASAP and this one can be reapplied.
Any comments?
Good sleuthing! Do you think we can also drop the FIXME since this will now seem to work?
Curiously, the code in NetBSD includes the equivalent of '+ TLS_TCB_OFFSET' but without an XXX, so I wonder if gonzo@ knew this looked bogus but kept it out of caution when merging since he couldn't test it? In that case, it seems that since you've now tested this case and confirmed it needs fixing, I do feel like we can probably drop it. Maybe ask gonzo@ to what he thinks?