Page MenuHomeFreeBSD

Workaround lld's inability to handle undefined weak symbols on risc-v.
ClosedPublic

Authored by jhb on Jan 7 2020, 12:31 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Apr 18, 8:51 PM
Unknown Object (File)
Dec 23 2023, 10:53 AM
Unknown Object (File)
Nov 25 2023, 7:56 PM
Unknown Object (File)
Nov 11 2023, 11:06 PM
Unknown Object (File)
Nov 11 2023, 5:00 PM
Unknown Object (File)
Nov 9 2023, 4:42 PM
Unknown Object (File)
Oct 10 2023, 10:05 PM
Unknown Object (File)
Oct 10 2023, 7:01 PM
Subscribers

Details

Summary

lld on RISC-V is not yet able to handle undefined weak symbols for
non-PIC code in the code model (medany/medium) used by the RISC-V
kernel.

Both GCC and clang emit an auipc / addi pair of instructions to
generate an address relative to the current PC with a 31-bit offset.
Undefined weak symbols need to have an address of 0, but the kernel
runs with PC values much greater than 2^31, so there is no way to
construct a NULL pointer as a PC-relative value. The bfd linker
rewrites the instruction pair to use lui / addi with values of 0
to force a NULL pointer address. (There are similar cases for
'ld' becoming auipc / ld that bfd rewrites to lui / ld with an
address of 0.)

To workaround this, compile the kernel with -fPIE when using lld.
This does not make the kernel position-independent, but it does
force the compiler to indirect address lookups through GOT entries
(so auipc / ld against a GOT entry to fetch the address). This
adds extra memory indirections for global symbols, so should be
disabled once lld is finally fixed.

A few 'la' instructions in locore really need to use auipc / addi
and not indirect via GOT entries, so change those to use 'lla'
which always uses auipc / addi for both PIC and non-PIC.

Test Plan
  • boot a clang+lld compiled risc-v kernel under qemu

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

The patch is from James. I tweaked it slightly to move the Makefile change out of kern.mk so it was simpler to have it not impact kernel modules.

Ah yes that's nicer having it in kern.pre.mk, I think I naively just assumed kern.mk included kern.pre/post.mk and thus it wouldn't have helped.

In the commit message it might be worth clarifying that it's *all* the la's that happen before we set up virtualisation, since they need to get physical addresses rather than the future virtual ones.

I also note we're playing a bit fast and loose with linker relaxations here; all those llas before we set up gp risk being relaxed to GP-relative (although unlikely given none of these symbols are in .sdata so it's a game of chance whether it happens to be placed near enough, and also prone to breakage if this code is edited later to refer to other symbols), as does the la of cpu_exception_handler. I think we should be initialising gp twice; once as the first thing in _start to get it in the physical world, and again at the start of va (and mpva) to get the virtual address as we do now (but earlier). But that's something for another patch.

Nit: workaround is a noun not a verb.

Ah yes that's nicer having it in kern.pre.mk, I think I naively just assumed kern.mk included kern.pre/post.mk and thus it wouldn't have helped.

In the commit message it might be worth clarifying that it's *all* the la's that happen before we set up virtualisation, since they need to get physical addresses rather than the future virtual ones.

Ok.

I also note we're playing a bit fast and loose with linker relaxations here; all those llas before we set up gp risk being relaxed to GP-relative (although unlikely given none of these symbols are in .sdata so it's a game of chance whether it happens to be placed near enough, and also prone to breakage if this code is edited later to refer to other symbols), as does the la of cpu_exception_handler. I think we should be initialising gp twice; once as the first thing in _start to get it in the physical world, and again at the start of va (and mpva) to get the virtual address as we do now (but earlier). But that's something for another patch.

To be clear, that's true of the existing code as well, not a "new" thing relative to the change to lla? The la virtmap at least always has to come before a load of gp. I guess if being paranoid we could use an explicit lui / addi sequence with %lo, etc.?

Nit: workaround is a noun not a verb.

Will fix in the real message.

In D23064#505712, @jhb wrote:

I also note we're playing a bit fast and loose with linker relaxations here; all those llas before we set up gp risk being relaxed to GP-relative (although unlikely given none of these symbols are in .sdata so it's a game of chance whether it happens to be placed near enough, and also prone to breakage if this code is edited later to refer to other symbols), as does the la of cpu_exception_handler. I think we should be initialising gp twice; once as the first thing in _start to get it in the physical world, and again at the start of va (and mpva) to get the virtual address as we do now (but earlier). But that's something for another patch.

To be clear, that's true of the existing code as well, not a "new" thing relative to the change to lla? The la virtmap at least always has to come before a load of gp. I guess if being paranoid we could use an explicit lui / addi sequence with %lo, etc.?

Yes, the existing code is risking that just as much as with this patch (lla and la behave identically when you have no -fPIE/-fPIC involved). la virt_map doesn't need to come first? Our gp at boot would be the physical address of __global_pointer$, so anything PC-relative would become GP-relative and yield the same address given both are physical. Then right at the start of va we'd first set gp to be the virtual address of __global_pointer$ to match PC now being a virtual address. so that once again GP-relative calculations give the same value as PC-relative. Issues only come when relaxations are available but gp doesn't point to __global_pointer$ in the same address space as PC (whether that's because it's in the wrong address space, or has yet to be initialised). And the paranoid solution would be .option norelax, not %hi/%lo, which comes with two issues: 1. it's not position independent, despite our medany code model, so if you linked outside of the [-2GB,+2GB] range (due to lui's sign extension) it would break and 2. by being absolute, you would always get its virtual address, which defeats the whole point of doing a PC-relative calculation for virt_map to get its physical address.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 7 2020, 11:18 PM
This revision was automatically updated to reflect the committed changes.