Used RB_TREE to store cu info.
Translate function logic:
First lookup in cache
Then lookup in tree. If success update cache.
Then brute force. If success insert in tree and cache
Details
Compared output with stock addr2line on 10000 random kernel addresses. The outputs are the same.
Compared performance with stock addr2line and gnu.
Memory usage on 10000 random addresses:
Tree: 77 MB
stock: 256 KB
GNU: 36 KB
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
contrib/elftoolchain/addr2line/addr2line.c | ||
---|---|---|
82 | Why not add a DIE field to struct CU above? Otherwise you are duplicating the lowpc/hipc fields. | |
430–431 | Do we need to keep the last_die cache anymore? Does it really help much? | |
518 | I think this structure never gets freed? Before that kind of made sense since we were putting it in an (unused) hash table, but now it does not. That said, I think it makes sense for struct node and struct CU to be merged, in which case you would not want to free the CUs since they end up in the lookup tree. | |
525 | I think we want to check for a match before doing any allocation, not after. | |
549 | This label name doesn't really make sense to me: the code which does the cache insertion comes before the label. | |
599 | On a cache hit, we will have cu == NULL here. Is that correct? Note that addr2line has -f and -i options that cause the (func || inlines) check to be true. Make sure that this case still works. | |
814 | This code should just be deleted. |
I fixed the issues according to the comments. We're using CU as tree node now and only allocate CU when addr is in the CU's range. The code looks much cleaner now.
contrib/elftoolchain/addr2line/addr2line.c | ||
---|---|---|
525 | Yes. We should just allocate cu here and add it to our tree. |
contrib/elftoolchain/addr2line/addr2line.c | ||
---|---|---|
125 | As a matter of style I would move this above the #define of USAGE_MESSAGE. USAGE_MESSAGE is used by usage() below. | |
418 | Extra newline. | |
432 | Per style(9) variable declarations should be at the beginning of the function. | |
804 | I don't really like the useless node. Instead of using it, can't we modify the lookup algorithm slightly to look at RB_LAST() if RB_NFIND() returns NULL? |
I removed the useless node and now we check max node if nfind returns null. I guess useless node was indeed useless lol.
Updated. Current code is passing my tests.
contrib/elftoolchain/addr2line/addr2line.c | ||
---|---|---|
545 | I think we do because tree lookup is O(n) and it's O(1) if it's a single cache HIT. This is good for sequential lookups. |
contrib/elftoolchain/addr2line/addr2line.c | ||
---|---|---|
545 | I guess my real question is, does it make a difference in practice? That is, can you measure a significant benefit from it when input is sorted? I don't object to keeping it if so, but the goal here is really just to ensure that addr2line is not abysmally slow. A 1% improvement is not very important and not worth the extra complexity in my opinion. |
contrib/elftoolchain/addr2line/addr2line.c | ||
---|---|---|
545 | The performance is basically exactly the same after I removed the single cache. I removed it. |
contrib/elftoolchain/addr2line/addr2line.c | ||
---|---|---|
545 | The performance is similar in both cases (sequential addresses and random addresses). |