dtrace -G emits an object file containing a DOF file. The DOF file
records the locations of all the probe sites and is loaded into the
kernel by an ELF init function. When it comes time to enable a probe,
the fasttrap module uses the information in the DOF file to figure out
which addresses to overwrite with breakpoints.

Probe site locations have several components which get added together:
1. address of the function containing the probe, relative to the load
   address of the containing ELF object
2. offset of the probe site within the function (since a function may
   contain multiple probe sites)
3. load address of the containing ELF object

Only (2) is known when dtrace -G runs. (3) is not known until runtime.
The above-mentioned ELF init function is responsible for providing it
when it registers the DOF file with the kernel. (1) is obtained using an
ELF relocation which references the function's symbol and is resolved at
static link time. However, the current implementation uses an incorrect
relocation type for this and only works because it exploits some
unspecified behaviour in the base system's GNU ld. It does not work with
newer binutils or lld, motivating this change.
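
To make the three components concrete, here is a minimal sketch in C of how
they combine. The struct and function names are invented for illustration and
do not reflect the real DOF layout or any fasttrap interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one probe site; not the real DOF layout. */
struct probe_site {
	uint64_t func_addr;	/* (1) function address relative to the load address */
	uint64_t site_off;	/* (2) offset of the probe site within the function */
};

/* Sum the three components; load_base is (3), known only at runtime. */
static uint64_t
probe_site_addr(const struct probe_site *ps, uint64_t load_base)
{
	assert(ps != NULL);
	return (load_base + ps->func_addr + ps->site_off);
}
```

For example, a function at offset 0x1000 in an object loaded at 0x400000, with
a probe site 0x24 bytes into the function, gives a breakpoint address of
0x401024.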

The amd64 ABI does not define a relocation type that gives the value for
(1). The closest one is R_X86_64_64, but with GNU ld and lld this
relocation type will yield the absolute address of the function: it gets
converted to an R_X86_64_RELATIVE at static link time, and rtld finishes
it off. This means that we could combine (1) and (3), but also means
that USDT probes will pessimize application startup times by forcing the
use of dynamic relocations and a writeable .SUNW_dof section. Since USDT
is intended for use in low-level system libraries (libc, libthr), this
overhead is undesirable. Upstream code does in fact use R_X86_64_64,
but for some reason the Sun link editor resolves the relocations at
static link time (I verified this in SmartOS running under bhyve). I
believe that behaviour is incorrect, but did not dig into it very much.

Another possibility, used in this change, is to instead obtain (1) using
PC-relative relocations. That is, for each probe site we emit an
R_X86_64_PC64 relocation against the function containing the probe.
Given (4), the absolute address of the relocation offset within the DOF
file, (1)+(2)+(4) then yields the absolute probe site address. Moreover,
the kernel's DOF registration code has (4) since it must copyin the DOF
file from the target process.
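
The arithmetic behind this can be sketched in C as follows. This assumes the
usual psABI definition of R_X86_64_PC64 (S + A - P); the function names are
invented for the example, and in practice the first step is done by the static
linker and the second by the kernel.

```c
#include <stdint.h>

/*
 * Value the static linker stores for an R_X86_64_PC64 relocation:
 * S + A - P, where S is the symbol address, A the addend and P the
 * address of the relocated field. S and P are both relative to the same
 * (still unknown) load address, so the load address cancels out.
 */
static uint64_t
link_pc64(uint64_t sym, int64_t addend, uint64_t place)
{
	return (sym + (uint64_t)addend - place);
}

/*
 * Runtime step: add the absolute address of the relocated field, (4),
 * and the offset of the probe site within the function, (2).
 */
static uint64_t
resolve_site(uint64_t stored, uint64_t field_addr, uint64_t site_off)
{
	return (stored + field_addr + site_off);
}
```

The stored value is generally a wrapped (negative) 64-bit quantity, since the
DOF section usually sits at a higher address than the function; unsigned
arithmetic makes the final sum come out right regardless.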

This change implements the approach described above for amd64 and i386.
It has two components, one in the kernel and one in the dtrace -G
implementation.

The kernel needs to be modified to handle (4) instead of (3). It currently uses
DOF relocations (i.e., a DOF-specific relocation format) to add (3) to
(1)+(2). A new DOF relocation type, DOF_RELO_DOFREL, is added so that
the kernel can use the DOF file address (4) instead, and libdtrace is
modified to emit these relocations instead of DOF_RELO_SETX.
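
The difference between the two relocation types can be sketched roughly as
below. This is not the kernel's code: the enum, struct and function names are
hypothetical, the numeric values are illustrative, and the real DOF relocation
records carry more fields than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative names and values only; not copied from the DOF headers. */
enum sketch_relo_type { SKETCH_RELO_SETX = 1, SKETCH_RELO_DOFREL = 2 };

struct sketch_relo {
	enum sketch_relo_type type;
	size_t offset;	/* byte offset of the 64-bit field within the DOF copy */
};

/*
 * dof is the kernel's copy of the DOF file, load_base is (3), and
 * dof_uaddr is the user address the DOF file was copied in from, so
 * dof_uaddr + r->offset is (4).
 */
static void
sketch_apply_relo(uint8_t *dof, size_t dof_len, const struct sketch_relo *r,
    uint64_t load_base, uint64_t dof_uaddr)
{
	uint64_t val;

	assert(dof_len >= sizeof(val) && r->offset <= dof_len - sizeof(val));
	memcpy(&val, dof + r->offset, sizeof(val));
	switch (r->type) {
	case SKETCH_RELO_SETX:
		val += load_base;		/* old scheme: add (3) */
		break;
	case SKETCH_RELO_DOFREL:
		val += dof_uaddr + r->offset;	/* new scheme: add (4) */
		break;
	}
	memcpy(dof + r->offset, &val, sizeof(val));
}
```

In the DOFREL case the field initially holds the PC-relative value left by the
static linker, so adding the field's own user address turns it into an
absolute address without the kernel needing the object's load address.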

The other modification is to emit symbol aliases for each function
containing a probe site. In order to fully resolve R_X86_64_PC64 at
static link time, the symbol's address must be known, but there are a
few possibilities that we must deal with:
- the symbol might be local,
- the symbol might be preemptible.

libdtrace already emits a global alias to deal with the first case. To
handle the second case, the code is modified to always emit an alias
(currently it just does so for local symbols), and we give the alias
hidden visibility so that it cannot be preempted. This allows the
PC-relative relocations to be resolved completely during the static
link of the application.
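
libdtrace arranges this in the object file itself, but the effect is similar
to what the following source-level sketch does with GCC/Clang attributes on an
ELF target. The function and alias names are hypothetical.

```c
/* A global function that another object could preempt at runtime. */
int
probed_func(int x)
{
	return (x + 1);
}

/*
 * Hypothetical alias for the same code. Hidden visibility means references
 * to it always bind within this object, so a PC-relative relocation
 * against the alias can be resolved entirely by the static linker.
 */
extern __typeof__(probed_func) probed_func_alias
    __attribute__((alias("probed_func"), visibility("hidden")));
```

A relocation against probed_func itself might have to be left for rtld when
building a shared object, since the symbol could be interposed; a relocation
against the hidden alias cannot be.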

This change modifies handling for amd64 and i386 object files. I've
started directly modifying the upstream code rather than using
#ifdef illumos/__FreeBSD__ everywhere: the code is starting to diverge
somewhat heavily and is difficult to read. The change also removes some
unused code for SPARC.