Page MenuHomeFreeBSD

libthr: Avoid interposable calls to memcpy/memset/memmove
Changes PlannedPublic

Authored by arichardson on Feb 8 2021, 12:51 PM.

Details

Summary

Calling memcpy() inside handle_signal() breaks TSan since the memcpy()
call is intercepted by TSan. If this happens while another interceptor
is being handled, the TSan runtime crashes. Avoid this by using a local
memcpy() implementation that cannot be intercepted.

This also adds a local memset() implementation since otherwise TSan
reports lots of false-positive races when multiple threads use thr_calloc
(due to memset accessing the same data without a lock visible to TSan).

Supersedes https://reviews.freebsd.org/D28531

Depends on D29536 (-Werror fixes)
Depends on D29535 (-Werror fixes)

Test Plan

LLVM's check-tsan has fewer failures if I set LD_LIBRARY_PATH to load a libthr with this patch.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 38254
Build 35143: arc lint + arc unit

Event Timeline

memcpy is signal safe since POSIX.1-2016, so that is a bug in TSan.

memcpy is signal safe since POSIX.1-2016, so that is a bug in TSan.

That's not the problem, the problem is that TSan intercepts memcpy and doesn't know that it's being called from the libthr signal handler wrapper.

Thread 25 hit Breakpoint 1, __sanitizer::CheckFailed (
    file=0x21d9a9 "/exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_mutex.cpp",
    line=157, cond=0x21a06f "((0)) != (0)", v1=0, v2=0)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:72
72	  if (atomic_fetch_add(&num_calls, 1, memory_order_relaxed) > 10) {
(gdb) bt
#0  __sanitizer::CheckFailed (
    file=0x21d9a9 "/exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_mutex.cpp",
    line=157, cond=0x21a06f "((0)) != (0)", v1=0, v2=0)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_termination.cpp:72
#1  0x00000000002de08f in __tsan::InternalDeadlockDetector::Lock (this=0x802ce7870, t=__tsan::MutexTypeSyncVar)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_mutex.cpp:157
#2  0x00000000002de5be in __tsan::Mutex::ReadLock (this=0x801a8e618)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_mutex.cpp:259
#3  0x0000000000318b38 in __tsan::MetaMap::GetAndLock (this=0xbccd88 <__tsan::ctx_placeholder+8>, thr=0x0,
    pc=0, addr=12358296, write_lock=false, create=false)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_sync.cpp:228
#4  0x0000000000318ce9 in __tsan::MetaMap::GetIfExistsAndLock (this=0xbccd88 <__tsan::ctx_placeholder+8>,
    addr=12358296, write_lock=false)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_sync.cpp:202
#5  0x000000000030d215 in __tsan::Acquire (thr=0x802cc7980, pc=0, addr=12358296)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp:407
#6  0x000000000028727e in __tsan::CallUserSignalHandler (thr=0x802cc7980, sync=false, acquire=true,
    sigact=false, sig=27, info=0x80098e208, uctx=0x80098e288)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1930
#7  0x000000000028085d in __tsan::ProcessPendingSignals (thr=0x802cc7980)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:2001
#8  0x00000000002805a2 in __tsan::ScopedInterceptor::~ScopedInterceptor (this=0x7fffdf9f9760)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:265
#9  0x000000000028adc5 in __interceptor_memcpy (dst=0x7fffdf9f97c0, src=0x7fffdf9f9c00, size=880)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:810
#10 0x00000008004a0bb7 in ?? () from /lib/libthr.so.3
#11 0x00000008004a013f in ?? () from /lib/libthr.so.3
#12 <signal handler called>
#13 0x0000000000278f69 in __tsan::ThreadClock::release (this=0x802cc7bf0, c=0x8016537c0, dst=0x801a91458)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_clock.cpp:269
#14 0x000000000030c6f5 in __tsan::ReleaseImpl (thr=0x802cc7980, pc=2641679, c=0x801a91458)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp:513
#15 0x000000000030cb27 in __tsan::MutexReadOrWriteUnlock (thr=0x802cc7980, pc=2641679, addr=34364855568)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_rtl_mutex.cpp:353
#16 0x0000000000284f9b in __interceptor_pthread_rwlock_unlock (m=0x8004e1510)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1431
#17 0x00000008004de11f in ?? () from /lib/libgcc_s.so.1
#18 0x00000008004dae46 in ?? () from /lib/libgcc_s.so.1
#19 0x00000008004dac8b in ?? () from /lib/libgcc_s.so.1
#20 0x00000008004d8de0 in ?? () from /lib/libgcc_s.so.1
#21 0x00000008004d90c3 in _Unwind_ForcedUnwind () from /lib/libgcc_s.so.1
#22 0x00000008004a903c in ?? () from /lib/libthr.so.3
#23 0x00000008004a8fb0 in ?? () from /lib/libthr.so.3
#24 0x00000008004a8e5b in pthread_exit () from /lib/libthr.so.3
#25 0x0000000000283174 in __interceptor_pthread_exit (retval=0x0)
    at /exports/users/alr48/sources/upstream-llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1052
#26 0x000000080049afb4 in ?? () from /lib/libthr.so.3
#27 0x0000000000000000 in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffdf9fb000

memcpy is signal safe since POSIX.1-2016, so that is a bug in TSan.

That's not the problem, the problem is that TSan intercepts memcpy and doesn't know that it's being called from the libthr signal handler wrapper.

Which is still a bug in TSan. The code here is perfectly fine; TSan needs fixing instead. This kind of "fix" potentially hurts performance and is fragile.

memcpy is signal safe since POSIX.1-2016, so that is a bug in TSan.

That's not the problem, the problem is that TSan intercepts memcpy and doesn't know that it's being called from the libthr signal handler wrapper.

Which is still a bug in TSan. The code here is perfectly fine; TSan needs fixing instead. This kind of "fix" potentially hurts performance and is fragile.

I agree, but I don't see an easy way of working around this in TSan since it needs to intercept memcpy(). The only way I see for fixing this in TSan is to overwrite all signal handlers and skip the memcpy interception inside signal handlers.
The simpler workaround would be to call a memcpy() alias that is not intercepted by TSan.

Note: This memcpy() was added in R10:02c3c85869f7ceee39a6e721b02df49714b49867, I haven't looked at the details so I'm am not sure why the ucontext_t needs to be copied in the first place. The ideal solution would be if we could remove this copy.

This just reinforces the opinion that libthr should be excluded from the scope of TSan (whatever it is).

In D28536#638754, @kib wrote:

This just reinforces the opinion that libthr should be excluded from the scope of TSan (whatever it is).

Yes, libthr should be excluded from TSan (and it mostly is). The only problem are the memcpy/memset calls since those are intercepted by TSan/ASan/etc. The calls from __thr_malloc() could probably be ignored by setting a flag in each of the pthread_* interceptors, but this one is really hard to detect since it's called directly from the signal handler.

I guess TSan could try to get a backtrace in each memcpy call and ignore it if two frames up is handle_signal(), but that will only work if debug symbols are available.

I don't understand why we need to make a copy of ucp here instead of setting the flag directly. That would avoid a 944 byte copy (on aarch64, CHERI-RISC_V it's 1248 bytes). Is ucp meant to be read-only?

In D28536#638754, @kib wrote:

This just reinforces the opinion that libthr should be excluded from the scope of TSan (whatever it is).

Yes, libthr should be excluded from TSan (and it mostly is). The only problem are the memcpy/memset calls since those are intercepted by TSan/ASan/etc. The calls from __thr_malloc() could probably be ignored by setting a flag in each of the pthread_* interceptors, but this one is really hard to detect since it's called directly from the signal handler.

I guess TSan could try to get a backtrace in each memcpy call and ignore it if two frames up is handle_signal(), but that will only work if debug symbols are available.

What about providing private libthr memcpy, i.e. instead of handle_signal_copy_ucontext(), just add memcpy() somewhere.

I don't understand why we need to make a copy of ucp here instead of setting the flag directly. That would avoid a 944 byte copy (on aarch64, CHERI-RISC_V it's 1248 bytes). Is ucp meant to be read-only?

Because ucp is invalid after sigreturn. This code postpones the call to user-provided signal handler after we left libthr critical section.

Add a local implementation of memcpy/memset instead.

arichardson retitled this revision from libthr: Avoid calling memcpy() while handling signals to libthr: Avoid interposable calls to memcpy/memset/memmove.Apr 1 2021, 2:48 PM
arichardson edited the summary of this revision. (Show Details)
arichardson edited the test plan for this revision. (Show Details)
arichardson edited the summary of this revision. (Show Details)

No, I object against this kind of ugliness. mem* functions are thread- and async-signal-safe. If something cannot intercept them without satisfying the requirements, that something is buggy.

Please just inline memcpy in single place (?) where it is needed. Or provide the local copy for it, again in the place where it is needed, perhaps naming it differently.

That said, globally, do we want to inspect and make some decisions about interposable symbols in libthr?

In D28536#662153, @kib wrote:

No, I object against this kind of ugliness. mem* functions are thread- and async-signal-safe. If something cannot intercept them without satisfying the requirements, that something is buggy.

Please just inline memcpy in single place (?) where it is needed. Or provide the local copy for it, again in the place where it is needed, perhaps naming it differently.

I submitted this version because you previously said What about providing private libthr memcpy, i.e. instead of handle_signal_copy_ucontext(), just add memcpy() somewhere. which is exactly what this patch does.
The previous version provided a non-interposable memcpy for the one use that cannot be worked around inside the tsan runtime, so should I go back to that one?

I think the memset false-positive race condition can be fixed in the tsan runtime.

The problem in TSan is that __interceptor_memcpy (dst=0x7fffdf9f97c0, src=0x7fffdf9f9c00, size=880) is called from a signal handler context, but TSan doesn't know about that and this corrupts the internal state. If I instead intercept __sys_sigaction to make TSan receive the signals and forward them to libthr, this causes even worse internal consistency issues in all sanitizer runtimes. Simply avoiding that one interposable call makes the TSan runtime happy again.

I looked at the previous patch as well. So I do not like either of two. What would TSAN do if user code calls memcpy() from its signal handler?

Note that there are several more memcpy() calls in the signal context, basically all memcpies() in thr_sig() except in thr_setcontext() and thr_swapcontext() are.

I am against adding libc code for this. I propose that we define thr_sig_memcpy(), like handle_signal_copy_ucontext() in your previous patch, but having memcpy() signature and operating on bytes. Then replace all memcpy() calls except the two with thr_sig_memcpy().

If you are fed with me, I can code this myself, sorry.

Sorry, I probably didn't explain the problem correctly. I will do some more debugging next week to narrow down the exact cause of this problem.

TSan (and other sanitizers) intercepting all user-provided signal handlers (they install their own signal handlers by interposing sigaction). I believe the issue with TSan deadlock is that the interposed memcpy is called between the return of the TSan-internal signal handler and the actual return from the real signal handler.
I tried interposing __sys_sigaction instead so that the TSan runtime has registered "real" signal handler, but this causes even worse issues. Not calling a function interposed by the TSan runtime (in this case memcpy()) between the return of the user signal handler and SYS_sigreturn appears to be sufficient to fix this crash (which makes -fsanitize=thread unusable on FreeBSD).

Interposing all other calls to memcpy() is perfectly fine, it's just this one that causes issues.