loader: support for mixed-endianness ELF/loader and POWER8
ClosedPublic
Actions

Authored by wma on Sep 20 2017, 11:09 AM.

Details

Reviewers

nwhitehorn
jhibbits
adrian
mw

Commits

rS328536: loader: support for mixed-endianness ELF/loader and POWER8

Summary

On POWER8 with the latest petitpoot, the loader.kboot is being
run as little-endian application. The FreeBSD kernel is
always big-endian on PPC, so the load_elf_* routines must be aware
of proper endianness of all fields.

Also:

dynamically resolve the base address for kexec load
modify trampoline code to be mixed-endianness compatible
fix syscall implementation on PPC Linux

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

wma created this revision.Sep 20 2017, 11:09 AM

Herald added a subscriber: imp. · View Herald TranscriptSep 20 2017, 11:09 AM

imp added inline comments.Sep 20 2017, 1:23 PM

sys/boot/common/load_elf.c
90–91 ↗	(On Diff #33238)	Since there's only 3 of them (each) I'd be tempted to encode the bits into the macro name. I think the code will be more readable that way... Just a thought.
681–690 ↗	(On Diff #33238)	Why can't all this be inside of arch_copyin?
747–748 ↗	(On Diff #33238)	Don't like the duplication. Can this code be written so that you don't need it?
sys/boot/powerpc/kboot/main.c
39 ↗	(On Diff #33238)	I'd remove this
sys/boot/powerpc/kboot/ppc64_elf_freebsd.c
165 ↗	(On Diff #33238)	I thought __dead2 was the right way to tag a function to keep the compiler happy...

wma added inline comments.Sep 20 2017, 2:52 PM

sys/boot/common/load_elf.c
681–690 ↗	(On Diff #33238)	arch_copyin is responsible only for copying, and is not aware of the nature of the data provided - it's just an architecture-specific equivalent of bcopy.
sys/boot/powerpc/kboot/ppc64_elf_freebsd.c
165 ↗	(On Diff #33238)	Might be. Tomorrow I'll check why it was added and remove if not required.

wma added inline comments.Sep 20 2017, 2:59 PM

sys/boot/powerpc/kboot/ppc64_elf_freebsd.c
137 ↗	(On Diff #33238)	Uh, should be htobe here as well.

Some advice from an IBM firmware team member:
‎[22:51] ‎<‎segher‎>‎ ah ok. some review... don't use endian-swapping macros; use a macro to do the actual memory accesses in the endian you want, instead
‎[22:52] ‎<‎segher‎>‎ and yup, very cool to see some other os on p8 :-)

Kevin, is there any explanation for this change?
All ELF parsing code is machine-independent. Some architectures don't have instructions allowing choose of data endianness, so I don't see a way how could we avoid using generic byte-swap macros.
I've used instructions you mentioned in the machine-specific trampoline code, which is a part of ppc64-kernel bootstrap and where we no longer care about code interoperatibility.

Also, most changes here touch only the loader.kboot, which is a Linux appliation being run to start the fbsd kernel.

I'm just relaying a conversation :)

‎[02:14] ‎<‎segher‎>‎ https://commandcenter.blogspot.nl/2012/04/byte-order-fallacy.html
‎[02:15] ‎<‎segher‎>‎ pike indepently reached the same conclusion as i did, so we must both be right ;-)
‎[02:16] ‎<‎segher‎>‎ the point is that you *never* have to byteswap
‎[02:16] ‎<‎segher‎>‎ either read or write in either BE or LE
‎[02:16] ‎<‎segher‎>‎ if stuff is not in memory, it *has* no endianness
‎[02:24] ‎<‎segher‎>‎ the point is that you should not access "foreign" data as if it is "native"

Freenode #ppc64 if you want to go direct, a lot of the firmware and kernel team are there

If I understood you correctly, you're referring to the part such as load_elf, lines 131-177.
The purpose of this code is to read the ELF header and store it as ehdr structure, which can be accessed by CPU. To do that, following algorithm is used:

Read the ELF header structure and store it in the memory. Loader operates on blocks of data, so we're loading the whole chunk of data at once using "block read" operation.
We might fall into sutuation, that some ELF header contents are BE-encoded while the loader runs as LE.
An exact ehdr structure shape denepends on the ELF arch/bits/etc. and is generated during compilation.

I read Pike's blog, and agree that having

i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif

is just dumb. However, the case here is sligtly different

function load_elf_header(...) {
 read(fd, ehdr, PAGE_SIZE)
 /* The ELF might have values either BE or LE encoded, so once the structure is loaded to memory using " block read", we need to be aware of its endianness */
 // here goes the swapping code making structure fields native-endian
 xETOH(ehdr->e_type, 16);  // etc.
}

/* generic code, which uses ehdr->e_xxx values to navigate through ELF file */
printf("%x", ehdr->e_entry);  // etc.

The problem here has two solutions:

Swap the fields just after the blob containing elfhdr is loaded into the memory, since that data is only read and used by the loader code itself. This is the way how it's implemented here.
Leave structure fields in as-loaded endianness and provide accessors for the structure fields performing Xetoh translation when necessary. I didn't go that path, because:
- the accessors will be complex and every layer would require an access to the ehdr to resovle class and endianness. For example, the section header accessor would be like

shdr_XXX_get(ehdr, shdr) {
   if (ehdr->e_ident[EI_DATA] == ELFDATA2MSB)
      if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
         return  be64toh(shdr->XXX);
      else
         return  be32toh(shdr->XXX);
   else
      if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
         return  le64toh(shdr->XXX);
      else
         return  le32toh(shdr->XXX);
}

changing every possible place which access hdr->sh_XXX directly to use accessors will generate diff having about 2-3k lines changed. I just don't see a reason for such complication, especially that it all maps to the same operation - swapping the structure fields before they're used.
it is more logical that the function called elf_load_header returns a header structure which fields can be used directly without any black magic
I wanted to have an ugly and error prone code contained to one file instead of spread across numerous places :)

Just to be clear, I'm not opting for my solution and I'm OK with either #1 or #2. It's just a matter of which one suits our needs.

wma updated this revision to Diff 33551.Sep 29 2017, 7:42 AM

Any objections for commiting this?

This revision was not accepted when it landed; it landed in state Needs Review.Jan 29 2018, 9:24 AM

Closed by commit rS328536: loader: support for mixed-endianness ELF/loader and POWER8 (authored by wma). · Explain Why

This revision was automatically updated to reflect the committed changes.

emaste added a subscriber: jhb.Jan 31 2018, 3:12 PM

emaste mentioned this in rS328826: Make cross-endian loader changes apply only to powerpc.Feb 3 2018, 1:24 AM

luporl mentioned this in D32160: powerpc64le: stand fixes.Sep 27 2021, 8:04 PM