Page MenuHomeFreeBSD

addr2line: add label checks when DW_AT_range and DW_AT_low_pc cannot be used
ClosedPublic

Authored by tig_freebsdfoundation.org on Feb 21 2020, 2:35 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Nov 22, 9:52 AM
Unknown Object (File)
Fri, Nov 15, 5:41 AM
Unknown Object (File)
Mon, Nov 4, 7:02 PM
Unknown Object (File)
Sun, Oct 27, 7:58 AM
Unknown Object (File)
Oct 6 2024, 11:43 PM
Unknown Object (File)
Sep 29 2024, 10:24 AM
Unknown Object (File)
Sep 28 2024, 8:36 AM
Unknown Object (File)
Sep 27 2024, 10:27 PM

Details

Summary

Check label's ranges for address we want to translate if a CU doesn't have usable DW_AT_range or DW_AT_low_pc.
Use more appropriate names: "struct CU" -> "struct range"
PR: https://sourceforge.net/p/elftoolchain/tickets/552/

Test Plan

Passing my test with random addresses, sequential addresses, DW_AT_range tests, and label tests.
While testing, I found some small/trivial discrepancy between GNU and elftoolchain output. If you give the program the low_pc of a function(subprogram), GNU will output first line of function's c code that's not a macro. Elftoolchain will output the line that has the opening bracket "{" of the function.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Check labels seems to be using a lot of memory (20 Mb). There might be a bug somewhere. I'm looking into it.

It looks like it's just the allocated memory for labels, as most of the memory used by check_labels are given back (freed) according to the picture.
Kernel probably has a lot of assembly files with a lot of labels. Is memory usage a problem?

image.png (895×415 px, 18 KB)

It looks like it's just the allocated memory for labels, as most of the memory used by check_labels are given back (freed) according to the picture.
Kernel probably has a lot of assembly files with a lot of labels. Is memory usage a problem?

image.png (895×415 px, 18 KB)

Yes, I can imagine that e.g., exception.S generates quite a few labels.

How does the memory usage compare to GNU addr2line? Maybe we can redesign the cache so that it consumes less memory.

contrib/elftoolchain/addr2line/addr2line.c
973 ↗(On Diff #68636)

There is no harm in calling dwarf_get_aranges() each time you need to use it. libdwarf's implementation first checks to see if it has already initialized the arange array for this particular dbg handle, and simply returns it if so. The first call does the initialization. So this caching that you're doing doesn't really help with anything, I believe.

I also don't think it makes sense to print a warning if dwarf_get_aranges() returns DW_DLV_NO_ENTRY. If you get anything other than DW_DLV_OK or DW_DLV_NO_ENTRY it should probably be a fatal error.

tig_freebsdfoundation.org added inline comments.
contrib/elftoolchain/addr2line/addr2line.c
973 ↗(On Diff #68636)

I see. Moved dwarf_get_aranges to check_labels and when to print warning.

I still have to read check_labels() carefully, but overall I think this is ok.

contrib/elftoolchain/addr2line/addr2line.c
261 ↗(On Diff #68643)

I think this comment could be a bit more descriptive. :)

270 ↗(On Diff #68643)

I think this block can be written without a goto:

if (tag != DW_TAG_label) {
   <something>
        goto cont_search;
}

Also, what happens when we have a label without a low_pc attribute?

420 ↗(On Diff #68643)

I think a high-level description of what this function is doing would be useful.

tig_freebsdfoundation.org added inline comments.
contrib/elftoolchain/addr2line/addr2line.c
270 ↗(On Diff #68643)

If we have a label without a low_pc attribute, we should goto cont_search since we can't use the current label. I made it more explicit in this revision.

I think all the comments are addressed. Is there anything else that needs to be fixed? Thanks

contrib/elftoolchain/addr2line/addr2line.c
266 ↗(On Diff #68940)

Style: indentation should be by four spaces here.

425 ↗(On Diff #68940)

"die" should be written "DIE".

483 ↗(On Diff #68940)

There's no need to cast the return value of calloc() in C.

498 ↗(On Diff #68940)

Indentation by four spaces.

551 ↗(On Diff #68940)

I think these comments ("free memory", "return status", "add to cache", etc.) which describe *what* the code is doing are not useful. It's better to ensure that the high-level algorithm is documented, and to write comments that explain *why* the code is doing something when it's not clear.

tig_freebsdfoundation.org marked 5 inline comments as done.

Addressed the comments.

contrib/elftoolchain/addr2line/addr2line.c
59 ↗(On Diff #76817)

This s/CU/range/ is just for clarity? I'm curious about reordering so die and dbg are at the top now. (Sorry if I missed something in the earlier discussion.)

contrib/elftoolchain/addr2line/addr2line.c
59 ↗(On Diff #76817)

Yes, I think I asked for the rename since we no longer really assume that a CU can be represented by a single address range. I'm not sure about the purpose of reordering the fields.

tig_freebsdfoundation.org added inline comments.
contrib/elftoolchain/addr2line/addr2line.c
59 ↗(On Diff #76817)

I don't remember why I reordered the fields. I will revert it.

tig_freebsdfoundation.org retitled this revision from Add label checks when DW_AT_range and DW_AT_low_pc cannot be used to addr2line: add label checks when DW_AT_range and DW_AT_low_pc cannot be used.Oct 26 2020, 5:18 PM

Looks like this needs to be rebased?

Patching file contrib/elftoolchain/addr2line/addr2line.c using Plan A...
Hunk #1 succeeded at 57 (offset 1 line).
Hunk #2 succeeded at 89 (offset 1 line).
Hunk #3 succeeded at 171 (offset 1 line).
Hunk #4 succeeded at 179 (offset 1 line).
Hunk #5 succeeded at 216 (offset 1 line).
Hunk #6 succeeded at 238 (offset 1 line).
Hunk #7 succeeded at 258 (offset 1 line).
Hunk #8 succeeded at 330 (offset 1 line).
Hunk #9 succeeded at 341 (offset 1 line).
Hunk #10 succeeded at 351 (offset 1 line).
Hunk #11 succeeded at 365 (offset 1 line).
Hunk #12 succeeded at 397 (offset 1 line).
Hunk #13 failed at 421.
Hunk #14 succeeded at 575 (offset 2 lines).
Hunk #15 succeeded at 609 (offset 1 line).
Hunk #16 succeeded at 669 (offset 2 lines).
Hunk #17 succeeded at 682 (offset 1 line).
Hunk #18 succeeded at 725 (offset 2 lines).
Hunk #19 succeeded at 782 (offset 1 line).
Hunk #20 succeeded at 835 (offset 2 lines).
1 out of 20 hunks failed--saving rejects to contrib/elftoolchain/addr2line/addr2line.c.rej
done
tig_freebsdfoundation.org marked 4 inline comments as done.
tig_freebsdfoundation.org edited the test plan for this revision. (Show Details)
contrib/elftoolchain/addr2line/addr2line.c
792 ↗(On Diff #78913)

Is this die -> range->die an incidental fix?

contrib/elftoolchain/addr2line/addr2line.c
792 ↗(On Diff #78913)

It would seem so. I don't remember exactly the reason for this change but when translating an address and at label out, range->die will be set to the correct die to use either from a cache lookup or a linear scan. I believe either die and range->die are both ok.

Still not handling the test case from https://sourceforge.net/p/elftoolchain/tickets/552/

with this patch

$ usr.bin/addr2line/obj/addr2line -f -e ~/src/elftoolchain/test/elfdump/ts/dso2/test.so 0xa88
addr2line: dwarf_tag failed: Invalid argument [dwarf_tag(242)]
??
??:0

binutils:

$ /usr/local/bin/addr2line -f -e ~/src/elftoolchain/test/elfdump/ts/dso2/test.so 0xa88
_fini
/home/aurel32/debian/co-packages/glibc/etch/glibc-2.3.6.ds1/build-tree/amd64-libc/csu/crti.S:37

llvm

$ ~/src/llvm-project/build/bin/llvm-addr2line -f -e ~/src/elftoolchain/test/elfdump/ts/dso2/test.so 0xa88
_fini
/home/aurel32/debian/co-packages/glibc/etch/glibc-2.3.6.ds1/build-tree/amd64-libc/csu/crti.S:37

Initially, I thought _fini is a label in the test case. However, I found, using readelf, that _fini isn't a label in the test case ~/src/elftoolchain/test/elfdump/ts/dso2/test.so. I found it in the symbol table (symtab) instead. Currently, symtab is not checked by addr2line so in order to pass this test a new feature to check symtab is needed.
This patch does help resolve labels addr. For example, when _init is in a DW_TAG_label:

[tig@tiger /usr/home/tig/freebsd-1/lib]$ /usr/local/bin/addr2line -f -e /usr/lib/debug/usr/lib/libdwarf.so.4.debug 0x2edec
_init
/usr/src/lib/csu/amd64/crti.S:34
[tig@tiger /usr/home/tig/freebsd-1/lib]$ /usr/bin/addr2line -f -e /usr/lib/debug/usr/lib/libdwarf.so.4.debug 0x2edec
init
/usr/src/lib/csu/amd64/crti.S:34

Maybe we should commit this patch first and have the symtab support as future work?

Maybe we should commit this patch first and have the symtab support as future work?

Yes, I'll commit it soon. I had a few style/whitespace fixes that I rolled in to my WIP (e.g. function signatures over 80 cols).

contrib/elftoolchain/addr2line/addr2line.c
586 ↗(On Diff #78913)

Swapped these around in FreeBSD in rS367853

contrib/elftoolchain/addr2line/addr2line.c
617 ↗(On Diff #78913)

could we actualy return anything other than DW_DLV_OK or DW_DLV_NO_ENTERY before?

contrib/elftoolchain/addr2line/addr2line.c
617 ↗(On Diff #78913)

I believe this function has always been returning either DW_DLV_OK or DW_DLV_NO_ENTRY

This revision was not accepted when it landed; it landed in state Needs Review.Nov 27 2020, 9:38 PM
This revision was automatically updated to reflect the committed changes.