As part of the RISC-V ABI, the gp register is expected to initialized
with the address of __global_pointer$ as early as possible. This allows
loads and stores from .sdata to be relaxed based on the value of gp. In
locore.S we do this initialization twice, once each for va and mpva.
However, in both cases the initialization is preceded by an la
instruction, which in theory could be relaxed by the linker.
Move the initialization of gp to be slightly earlier (before la
cpu_exception_handler), and add an additional gp initialization at the
very beginning of _start, before virtual memory is set up.
Reported by: jrtc27