The need to use libc malloc(3) from some places in libthr always caused issues. For instance, per-thread key allocation was switched to use plain mmap(2) to get storage, because some third party mallocs used keys for implementation of calloc(3).
Even more important, libthr calls calloc(3) during initialization of pthread mutexes, and jemalloc uses pthread mutexes. jemalloc provides some way to both postpone the initialization, and to make initialization to use specialized allocator, but this is very fragile and often breaks. See the referenced PR for another example.
Add the small malloc implementation used by rtld, to libthr. Use it in thr_spec.c and for mutexes initialization. This avoids the issues with mutual dependencies between malloc and libthr in principle. The drawback is that some more allocations are not interceptable for alternate malloc implementations. There should be not too much memory use from this allocator, and the alternative, direct use of mmap(2) is obviously worse.