The offset here is relative to the TCB, not whatever the thread pointer
points to, so as with powerpc and powerpc64 we need to account for that.
However, rather than using hard-coded offsets as they did, due to
predating machine/tls.h, we can just re-use _tcb_get().
Note that if libthr is used, and its initialiser has been called, it
will take a different path that uses _get_static_tls_base, which works
just fine on riscv (adding the offset to thr->tcb). This only affects
programs that aren't linked against libthr (or that are but manage to
dlopen before the initialiser is called, if that's even possible).
In future this code should be made MI by just reusing _tcb_get() and
checking the TLS variant (since the offset here is positive even for
variant II, where it should be subtracted), but this is a targeted fix
that makes it clear what's changing.
Fixes: 5d00c5a6571c ("Fix initial exec TLS mode for dynamically loaded shared objects.")
MFC after: 1 week