Page MenuHomeFreeBSD

libelf: add note desc endian conversion
AcceptedPublic

Authored by emaste on Mar 5 2020, 3:01 PM.

Details

Summary

Previously we just copied a note's data (desc) verbatim when translating between different endianness. Add translation for the four existing FreeBSD notes, all of which contain one or more 32-bit words.

Diff Detail

Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

contrib/elftoolchain/libelf/libelf_convert.m4
920

This can be for (; len >= 4; len -= 4) {.

928

The descriptor size passed to this function is already rounded up to a multiple of 4, so we are potentially swapping pad bytes here.

1017–1021

Maybe say, "copy note name and padding".

1105

Put this block in a subroutine?

contrib/elftoolchain/libelf/libelf_convert.m4
920

OK, I suppose that's a bit clearer.

928

I guess we should really pass the un-rounded-up descsz in, although in practice right now all instances have exactly 1 32-bit value.

contrib/elftoolchain/libelf/libelf_convert.m4
928

Oh, I got confused and didn't see that we were simply copying the rest of the descriptor here, so I guess that's fine.

incorporate some feedback from @markj

contrib/elftoolchain/libelf/libelf_convert.m4
1078

Why did this change?

contrib/elftoolchain/libelf/libelf_convert.m4
1078

Silently truncating is bogus, we should return false (0) if the buffer is smaller than the claimed descsz/namesz. Note that we don't do it in the _tom case.

I could commit this separately first.

contrib/elftoolchain/libelf/libelf_convert.m4
1078

It seems a bit magical to me that libelf knows about certain note types. This behaviour is part of the API, i.e., if you start swapping for a given note type, consumers which already handled that will be broken, I believe. So I'm not really certain this change is the right way to go about this. Maybe it should be requested explicitly by the caller through a new API. In any case the implementation looks right to me.

contrib/elftoolchain/libelf/libelf_convert.m4
1078

Thanks.

This revision is now accepted and ready to land.Mar 5 2020, 9:19 PM

I'm not sure if I have the full context here -- what specifically is the problem being addressed by this change?

This is addressing ticket 583, and specifically

For the readelf the best approach is probably to teach libelf how to translate the ELF notes used by various operating sytems?

@markj writes

It seems a bit magical to me that libelf knows about certain note types. This behaviour is part of the API, i.e., if you start swapping for a given note type, consumers which already handled that will be broken, I believe.

It is a little magical, but libelf already translates everything else to/from host order; my take is that this is just a longstanding oversight. I am not aware of libelf consumers which do their own swapping of note desc data, and there are consumers which are currently buggy and are fixed by this change (like readelf).

emaste> This is addressing ticket 583, and specifically
emaste> For the readelf the best approach is probably to teach libelf how to translate the ELF notes used by various operating sytems?

Oh I see -- I think I might have misunderstood your comment in Elftoolchain#583.

The fields of an ELF note's header need to be byte swapped in order to parse the note itself because these fields are stored with the ELF object's native enddianness. However the contents of the note's 'name' and 'descr' fields that follow the header are pretty much up to the application, and these are defined differently by various OS projects: please see NetBSD's documentation or the Linux elf(5) manual page or the glibc definition. Sometimes the descr field needs to be byte swapped when reading or writing, while at other times it holds a string which should not be byteswapped. Sometimes the actual structure of an OS-specific ELF note is quite different from that stated in its documentation.

Making libelf aware of every possible ELF note type (& its possible evolution) would be awkward. But requiring that every application needs to know how to parse (or create) the OS-specific byte strings that make up Elf Note name and descr values would be awkward too.

So I was instead thinking of making libelf's section translation support extensible -- Elftoolchain#585. The idea was that libelftc (say) would build on this by offering a "library" of parsers of known ELF note types.

the contents of the note's 'name' and 'descr' fields

Not quite, the 'name' is a string (well, mostly a string) and is never swapped.

Making libelf aware of every possible ELF note type (& its possible evolution) would be awkward.

This is true, but IMO less awkward than repeating this logic in every libelf consumer. Yes, dealing with new notes will be a challenge.

But requiring that every application needs to know how to parse (or create) the OS-specific byte strings that make up Elf Note name and descr values would be awkward too.

Inteed. readelf is currently broken because of this:

% readelf/readelf -n ls.mips32

Notes at offset 0x0000018c with length 0x00000030:
  Owner         Data size       Description
  FreeBSD       0x00000004      NT_FREEBSD_ABI_TAG
   ABI tag: 3209630208
  FreeBSD       0x00000004      NT_FREEBSD_NOINIT_TAG
   description data: 00 00 00 00

3209630208 is 1200063 byte-swapped.

Pragmatically speaking there aren't really that many OS-specific ELF notes, and most of them are going to be either strings or arrays of 4- or 8-byte integers and handled relatively easily. It's a tractable problem to address them for all of the relevant ELF-based operating systems.

Translating CORE notes would be more of a challenge.

jhibbits added inline comments.
contrib/elftoolchain/libelf/libelf_convert.m4
1009–1010

I think this is intended to maintain alignment, but it affects the size of the note, and if only one note exists, and it's an odd length, this will cause it to fail. I noticed this when debugging mesa DRI for powerpc64.

contrib/elftoolchain/libelf/libelf_convert.m4
1009–1010

I re-read the elfutils code, and it looks like this part is correct, for the most part. However, down below is a problem. elfutils copies the remaining data to the buffer, instead of returning error. The note convert function (elf_cvt_note()) is a void anyway, but it doesn't signal any kind of error, it just copies out the truncated data. I think this should be done here, for non-multiple-of-4 blocks (such as string notes....). It shouldn't require any more space than doing the no-swap memcpy() should it?

I don't really think this belongs internally in libelf FWIW. What I would like is a way to create a new Elf_Data for a given (void *, len) pair and then use gelf_xlatetom() to generate a translated version. The code in readelf that parses FreeBSD notes could then choose to use an Elf_Data with a type of ELF_T_WORD in this way to do the byte swaps.

Alternatively, for readelf alone just having a simple 'uint32_t elf_xlatetom32(Elf *elf, uint32_t value)' would also be useful and would be fairly easy to use in existing code. (In this case the 32 size is fixed, so it doesn't need to be gelf_ I think). Ideally this would use EI_DATA from Elf instead of requiring it as an argument the way the other *xlate* functions do.