Page MenuHomeFreeBSD

librtld_db: Handle shlibs with discontiguous mappings.
ClosedPublic

Authored by chs on Nov 11 2021, 9:48 PM.

Details

Summary

I was trying to use dtrace ustack() to log all of the malloc/free calls in
an SSL-using application (nginx), and I found that many of the stack traces
were missing library and function names for lots of the stack frames.
The same thing happens with the much simpler to use "openssl" command,
here is an example from that:

CPU ID FUNCTION:NAME

1  68835                     malloc:entry 
            libc.so.7`malloc+0x1
            `0x8012c4fd3
            `0x8012f8043
            `0x8012fcc9c
            openssl`main+0xd2
            openssl`_start+0x10d
            `0x8010f1008

It turned out that the mystery library is libcrypto.so, and here is how
that library is mapped into the openssl command address space:

2926 0x8011a9000 0x8012c2000 r-- 281 2408 17 0 CN--- vn /lib/libcrypto.so.111
2926 0x8012c2000 0x8012c3000 --- 0 0 0 0 CN--- gd
2926 0x8012c3000 0x80146b000 r-x 424 2408 17 0 CN--- vn /lib/libcrypto.so.111
2926 0x80146b000 0x801496000 r-- 43 0 1 0 C---- vn /lib/libcrypto.so.111
2926 0x801496000 0x80149c000 rw- 6 0 1 0 C---- vn /lib/libcrypto.so.111

The "gd" guard page left in the middle of the file mappings is a bit odd,
but it looks legitimate based on the ELF headers of the library:

/lib/libcrypto.so.111: file format elf64-x86-64-freebsd

Program Header:

PHDR off    0x0000000000000040 vaddr 0x0000000000000040 paddr 0x0000000000000040 align 2**3
     filesz 0x0000000000000230 memsz 0x0000000000000230 flags r--
LOAD off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**12
     filesz 0x00000000001185b4 memsz 0x00000000001185b4 flags r--
LOAD off    0x0000000000119000 vaddr 0x000000000011a000 paddr 0x000000000011a000 align 2**12
     filesz 0x00000000001a7bc0 memsz 0x00000000001a7bc0 flags r-x
LOAD off    0x00000000002c0bc0 vaddr 0x00000000002c2bc0 paddr 0x00000000002c2bc0 align 2**12
     filesz 0x0000000000029678 memsz 0x0000000000029678 flags rw-
LOAD off    0x00000000002ea240 vaddr 0x00000000002ed240 paddr 0x00000000002ed240 align 2**12
     filesz 0x0000000000005d20 memsz 0x0000000000009400 flags rw-

The ELF phdrs do not map anything at vaddr 0x0000000000119000,
so that bit of the guard mapping that reserves the space for a
shared library is still there after the library is fully mapped.

The problem here is in librtld_db's rd_loadobj_iter(), where we translate
the kernel's mapping information obtained from kinfo_getvmmap() into
rtld_db's rd_loadobj_t records, which is then later converted by libproc's
proc_rdagent() and map_iter() into libproc's struct map_info records,
where the information is eventually used by proc_addr2sym() called from
libdtrace's dt_print_ustack(). There are two related bugs in
this code in rd_loadobj_iter() that are triggered by the guard page
that is left in between the first two libcrypto.so mappings:

if (kve->kve_type == KVME_TYPE_VNODE) {
        if (kve->kve_vn_fileid != fileid) {
                base = kve->kve_start;
                fileid = kve->kve_vn_fileid;
                path = kve->kve_path;
        }
} else {
        base = 0;
        path = NULL;
}

(1) we need to set "path" for all vnode mappings, not just ones

that are in an uninterrupted run after the initial mapping of
a new fileid.

(2) we should not reset "base" to 0 when we see a non-vnode mapping,

because more vnode mappings for the same fileid might still be
coming later.

With the patch in this review, dtrace ustack() now shows the library
and function names for the libcrypto functions:

CPU ID FUNCTION:NAME

3  68835                     malloc:entry 
            libc.so.7`malloc+0x1
            libcrypto.so.111`CRYPTO_zalloc+0x43
            libcrypto.so.111`BIO_new+0x23
            libcrypto.so.111`BIO_new_fp+0x1c
            openssl`main+0xd2
            openssl`_start+0x10d
            `0x8010f1008

The tests in /usr/tests/lib/libproc also continue to pass with this patch.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

chs requested review of this revision.Nov 11 2021, 9:48 PM
chs added a reviewer: markj.

Thanks for the detailed explanation. I think this patch is right, I have just one comment. I can't quickly see why we have the gap in libcrypto.so, there's no relevant linker options being specified that I can see.

lib/librtld_db/rtld_db.c
207

This value ends up being nonsensical for non-vnode mappings. Can we also change the loop so that rdl_offset is 0 for such mappings?

This revision is now accepted and ready to land.Nov 12 2021, 8:36 PM

update for review comments.

This revision now requires review to proceed.Dec 3 2021, 3:18 PM
chs marked an inline comment as done.Dec 3 2021, 3:22 PM

Yea, I couldn't see any reason for the linker to leave the gap in libcrypto.so's memory layout either, but it doesn't seem like an invalid thing to do either, and it's easy enough to handle it.

This revision is now accepted and ready to land.Dec 3 2021, 3:31 PM