ELF: detect and reject CheriABI binaries
Needs ReviewPublic

Authored by brooks on Tue, Feb 24, 11:49 AM.
Tags
None
Referenced Files
F145855388: D55480.id172576.diff
Wed, Feb 25, 6:41 AM
F145855331: D55480.id172594.diff
Wed, Feb 25, 6:40 AM

Details

Reviewers
kib
olce
emaste
jhb
andrew
manu
Group Reviewers
cheri
Summary

On arm64, CheriABI binaries are currently distinguished from non-CHERI
binaries by an ELF flag. Arm has reserved one for Morello. By
rejecting these binaries, we allow other alternative ABIs to match. In
particular, we can use binmiscctl to enable QEMU user mode on native
arm64 systems to support cross compilation for Morello.
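
As a rough illustration of the check described above (not the code in this diff), a flag test on the ELF header might look like the sketch below. The `sketch_` names and the simplified header struct are hypothetical, and the purecap flag value is an assumption based on the Morello ELF supplement; verify it against the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EM_AARCH64 is 183 in the ELF specification. */
#define SKETCH_EM_AARCH64 183
/* Assumed value of the Morello purecap e_flags bit; check the real header. */
#define SKETCH_EF_AARCH64_CHERI_PURECAP 0x00010000u

/* Hypothetical, trimmed-down stand-in for Elf64_Ehdr. */
struct sketch_ehdr {
	uint16_t e_machine;
	uint32_t e_flags;
};

/* True if the header describes an arm64 CheriABI (purecap) binary. */
static bool
sketch_is_cheriabi(const struct sketch_ehdr *hdr)
{
	assert(hdr != NULL);
	return (hdr->e_machine == SKETCH_EM_AARCH64 &&
	    (hdr->e_flags & SKETCH_EF_AARCH64_CHERI_PURECAP) != 0);
}

/*
 * A non-CHERI arm64 kernel declines CheriABI binaries, so that another
 * activator (for example one registered via binmiscctl) can claim them.
 */
static bool
sketch_native_accepts(const struct sketch_ehdr *hdr)
{
	return (hdr->e_machine == SKETCH_EM_AARCH64 &&
	    !sketch_is_cheriabi(hdr));
}
```

The point of rejecting rather than ignoring the flag is that a declined image falls through to the remaining activators instead of being started as a native arm64 process.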

Later in the CHERI upstreaming process we'll expand these checks to
require that binaries support CHERI by default on CHERI targets
(introducing a freebsd64 layer for integer-pointer ABIs). We'll also
expand checks to RISC-V as appropriate once the psABI team picks an
approach (CheriBSD currently uses a flag in the reserved space).

Effort: CHERI upstreaming
Requested by: def
Sponsored by: DARPA, AFRL
Co-authored-by: John Baldwin <jhb@FreeBSD.org>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 70958
Build 67841: arc lint + arc unit

Event Timeline

jrtc27 added inline comments.
sys/riscv/include/elf.h
66 ↗(On Diff #172576)

This one probably shouldn't be upstreamed. The flag is in the reserved space, not the custom space (no custom space existed at the time of defining it).

sys/riscv/include/elf.h
66 ↗(On Diff #172576)

I can drop this part. It's less useful than the arm64 part.

brooks marked an inline comment as done.
sys/sys/elf_common.h
349

Should this be sorted before Arm and be in its own block? (LoongArch screwed up the consistent ordering for the former...)

sys/sys/elf_common.h
349

Yeah. I have some more proposed additions in D55486 and noticed a few other misordered entries.

EF_AARCH64 should be before EF_ARM.

  • Rebase
  • Sort EF_AARCH64 before EF_ARM

First, there are two unrelated changes. One for the in-kernel ELF image activator, another for the kernel linker.

Then, for the image activator, why do the is_cheri checks need to be done in MI code? We have brand matchers (header_supported/brand_supported), which get the pointer to the mapped ELF header. Why can't it be done there, in <arch>/elf_machdep.c?
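
For readers unfamiliar with the brand matchers mentioned here, the MD alternative could be sketched roughly as follows. This is a simplified, hypothetical model: the `hook_` structs are not the kernel's `Elf_Brandinfo` or `image_params`, the real `header_supported` callback has a different signature, and the flag value is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of the Morello purecap e_flags bit. */
#define HOOK_EF_CHERI_PURECAP 0x00010000u

/* Hypothetical, trimmed-down ELF header. */
struct hook_ehdr {
	uint16_t e_machine;
	uint32_t e_flags;
};

/* Hypothetical brand entry with an optional per-architecture hook. */
struct hook_brand {
	uint16_t machine;
	bool (*header_supported)(const struct hook_ehdr *);
};

/* MD hook: an integer-pointer arm64 brand declines purecap binaries. */
static bool
hook_arm64_header_supported(const struct hook_ehdr *hdr)
{
	return ((hdr->e_flags & HOOK_EF_CHERI_PURECAP) == 0);
}

/* MI matcher: compares the machine, then defers to the MD hook if set. */
static bool
hook_brand_matches(const struct hook_brand *b, const struct hook_ehdr *hdr)
{
	if (hdr->e_machine != b->machine)
		return (false);
	return (b->header_supported == NULL || b->header_supported(hdr));
}
```

In this arrangement the MI code never looks at the flag itself; each architecture's brand decides what its e_flags bits mean.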

In D55480#1269551, @kib wrote:

First, there are two unrelated changes. One for the in-kernel ELF image activator, another for the kernel linker.

Then, for the image activator, why do the is_cheri checks need to be done in MI code? We have brand matchers (header_supported/brand_supported), which get the pointer to the mapped ELF header. Why can't it be done there, in <arch>/elf_machdep.c?

You could make the same argument for e_machine/EI_CLASS/EI_DATA. Like those, this is an MI concept, so it's in MI code rather than having MD boilerplate (which, I will note, also has some odd semantics when it comes to the bool-returning functions, though I don't remember off the top of my head what those were). The MD hooks are for MD things like "v1 or v2 of this architecture's psABI", whereas this is part of ILP32-vs-LP64-vs-"L64PC128" (or whatever name gets given to it, if any), and corresponds to a specific instantiation of the imgact_elf.c template (Elf32 vs Elf64 vs "Elf64C").

Right now e_machine/class/data/version together form the arch identifier. That scheme was not well thought out and dates from a time when the eventual breadth of ELF adoption was not envisioned. For instance, they reserved separate machine IDs for 386 vs 486. Then they used a different machine ID for x86_64 instead of using the 64-bit class, etc.

We do not distinguish arch-specific ISAs in imgact_elf. We do not identify Linux binaries there; that is the job of the arch-specific brand. IMO we must not start doing that for CHERI.

BTW, it is up to you how you identify CHERI binaries, but I am surprised that you use flags instead of adding separate machine IDs, since the ISA is different.

We don't really like using flags like this, but it's complicated. Especially for something like RISC-V that foolishly uses the same EM_RISCV for both 32-bit and 64-bit ISAs, and both RVE and RVI base ISAs (16 vs 32 GPRs).

Whether what's done for machine/class/data is best or not, I would much rather we treat however this is encoded in the ELF in the same way as those. It's far less confusing if you can just pretend it's part of that "core ABI tuple".
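
To make the "core ABI tuple" framing concrete, here is a minimal sketch in which CHERI-ness is compared alongside class, data encoding and machine. All `tuple_` names are hypothetical, the "Elf64C" instantiation is only the working name used in this thread, and the flag value is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define TUPLE_ELFCLASS64	2	/* ELFCLASS64 in the ELF spec */
#define TUPLE_ELFDATA2LSB	1	/* ELFDATA2LSB in the ELF spec */
#define TUPLE_EM_AARCH64	183	/* EM_AARCH64 in the ELF spec */
/* Assumed value of the Morello purecap e_flags bit. */
#define TUPLE_EF_CHERI_PURECAP	0x00010000u

/* Hypothetical, trimmed-down ELF header. */
struct tuple_ehdr {
	uint8_t  ei_class;
	uint8_t  ei_data;
	uint16_t e_machine;
	uint32_t e_flags;
};

/* What one instantiation of the image activator template accepts. */
struct tuple_abi {
	uint8_t  ei_class;
	uint8_t  ei_data;
	uint16_t e_machine;
	bool     cheri;
};

/* Decode the per-ELF encoding of "is this a CheriABI binary". */
static bool
tuple_hdr_is_cheri(const struct tuple_ehdr *hdr)
{
	return ((hdr->e_flags & TUPLE_EF_CHERI_PURECAP) != 0);
}

/* MI check: all four tuple members must agree. */
static bool
tuple_matches(const struct tuple_abi *abi, const struct tuple_ehdr *hdr)
{
	return (hdr->ei_class == abi->ei_class &&
	    hdr->ei_data == abi->ei_data &&
	    hdr->e_machine == abi->e_machine &&
	    tuple_hdr_is_cheri(hdr) == abi->cheri);
}
```

Under this view an "Elf64" instantiation and an "Elf64C" instantiation would differ only in the `cheri` member, much as Elf32 and Elf64 differ in class.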