Index: vendor/llvm/dist/docs/ReleaseNotes.rst
===================================================================
--- vendor/llvm/dist/docs/ReleaseNotes.rst (revision 314410)
+++ vendor/llvm/dist/docs/ReleaseNotes.rst (revision 314411)
@@ -1,278 +1,314 @@
========================
LLVM 4.0.0 Release Notes
========================
.. contents::
:local:
Introduction
============
This document contains the release notes for the LLVM Compiler Infrastructure,
release 4.0.0. Here we describe the status of LLVM, including major improvements
from the previous release, improvements in various subprojects of LLVM, and
some of the current users of the code. All LLVM releases may be downloaded
from the `LLVM releases web site `_.
For more information about LLVM, including information about the latest
release, please check out the `main LLVM web site `_. If you
have questions or comments, the `LLVM Developer's Mailing List
`_ is a good place to send
them.
+New Versioning Scheme
+=====================
+Starting with this release, LLVM is using a
+`new versioning scheme `_,
+increasing the major version number with each major release. Stable updates to
+this release will be versioned 4.0.x, and the next major release, six months
+from now, will be version 5.0.0.
+
Non-comprehensive list of changes in this release
=================================================
* The minimum compiler versions required to build LLVM have been raised to GCC
4.8 and VS 2015.
* The C API functions ``LLVMAddFunctionAttr``, ``LLVMGetFunctionAttr``,
``LLVMRemoveFunctionAttr``, ``LLVMAddAttribute``, ``LLVMRemoveAttribute``,
``LLVMGetAttribute``, ``LLVMAddInstrAttribute`` and
``LLVMRemoveInstrAttribute`` have been removed.
* The C API enum ``LLVMAttribute`` has been deleted.
* The definition and uses of ``LLVM_ATTRIBUTE_UNUSED_RESULT`` in the LLVM source
were replaced with ``LLVM_NODISCARD``, which matches the C++17 ``[[nodiscard]]``
semantics rather than gcc's ``__attribute__((warn_unused_result))``.
* The Timer-related APIs now expect a Name and a Description. When upgrading
code, the previously used names should become descriptions, and a short name in
the style of a programming language identifier should be added.
* LLVM now handles ``invariant.group`` across different basic blocks, which makes
it possible to devirtualize virtual calls inside loops.
* The aggressive dead code elimination phase ("adce") now removes
branches which do not affect program behavior. Loops are retained by
default since they may be infinite, but they can also be removed
with the LLVM option ``-adce-remove-loops`` when the loop body otherwise has
no live operations.
* The GVNHoist pass is now enabled by default. The new pass based on Global
Value Numbering detects similar computations in branch code and replaces
multiple instances of the same computation with a unique expression. The
transform benefits code size and generates better schedules. GVNHoist is
more aggressive at ``-Os`` and ``-Oz``, hoisting more expressions at the
expense of execution time.
* The llvm-cov tool can now export coverage data as JSON. Its HTML output mode
has also improved.
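As a rough illustration of the ``LLVM_NODISCARD`` item above, the sketch below
shows the C++17 ``[[nodiscard]]`` behaviour the macro is described as matching:
the attribute can annotate a type as well as a function, so every function
returning that type by value is covered. ``MY_NODISCARD``, ``Status``,
``doWork`` and ``parseDigit`` are hypothetical names invented for this example
(they are not LLVM APIs), and a C++17 compiler is assumed.

```cpp
// Hypothetical stand-in for LLVM_NODISCARD; assumes a C++17 compiler.
#define MY_NODISCARD [[nodiscard]]

// Annotating the type: any function returning Status by value is covered.
struct MY_NODISCARD Status {
  bool Ok;
};

// No attribute on the function itself; it is covered through its return type.
inline Status doWork(int X) { return Status{X >= 0}; }

// Annotating a single function instead of a type.
MY_NODISCARD inline int parseDigit(char C) {
  return (C >= '0' && C <= '9') ? C - '0' : -1;
}
```

Writing ``doWork(1);`` as a statement on its own is expected to draw a compiler
warning, whereas ``Status S = doWork(1);`` uses the result and is fine.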
Improvements to ThinLTO (-flto=thin)
------------------------------------
* Integration with profile data (PGO). When available, profile data
enables more accurate function importing decisions, as well as
cross-module indirect call promotion.
* Significant build-time and binary-size improvements when compiling with
debug info (-g).
LLVM Coroutines
---------------
Experimental support for :doc:`Coroutines` was added. It can be enabled
with ``-enable-coroutines`` in the ``opt`` command-line tool, or by using the
``addCoroutinePassesToExtensionPoints`` API when building the optimization
pipeline.
For more information on LLVM Coroutines and the LLVM implementation, see
`2016 LLVM Developers’ Meeting talk on LLVM Coroutines
`_.
Regcall and Vectorcall Calling Conventions
--------------------------------------------------
Support was added for the ``__regcall`` calling convention.
Existing ``__vectorcall`` calling convention support was extended to include
correct handling of homogeneous vector aggregates (HVAs).
The ``__vectorcall`` calling convention was introduced by Microsoft to
enhance register usage when passing parameters.
For more information please read `__vectorcall documentation
`_.
The ``__regcall`` calling convention was introduced by Intel to
optimize parameter transfer on function call.
This calling convention ensures that as many values as possible are
passed or returned in registers.
For more information please read `__regcall documentation
`_.
Code Generation Testing
-----------------------
Passes that work on the machine instruction representation can be tested with
the .mir serialization format. ``llc`` supports the ``-run-pass``,
``-stop-after``, ``-stop-before``, ``-start-after`` and ``-start-before``
options to run a single pass of the code generation pipeline, or to stop or
start the pipeline at a given point.
Additional information can be found in the :doc:`MIRLangRef`. The format is
used by the tests ending in ``.mir`` in the ``test/CodeGen`` directory.
This feature has been available since 2015. It has seen increasing use recently
but had not yet been mentioned in the release notes.
Intrusive list API overhaul
---------------------------
The intrusive list infrastructure was substantially rewritten over the last
couple of releases, primarily to excise undefined behaviour. The biggest
changes landed in this release.
* ``simple_ilist`` is a lower-level intrusive list that never takes
ownership of its nodes. New intrusive-list clients should consider using it
instead of ``ilist``.
* ``ilist_tag`` allows a single data type to be inserted into two
parallel intrusive lists. A type can inherit twice from ``ilist_node``,
first using ``ilist_node<T,ilist_tag<A>>`` (enabling insertion into
``simple_ilist<T,ilist_tag<A>>``) and second using
``ilist_node<T,ilist_tag<B>>`` (enabling insertion into
``simple_ilist<T,ilist_tag<B>>``), where ``A`` and ``B`` are arbitrary
types.
* ``ilist_sentinel_tracking`` controls whether an iterator knows
whether it's pointing at the sentinel (``end()``). By default, sentinel
tracking is on when ABI-breaking checks are enabled, and off otherwise;
this is used for an assertion when dereferencing ``end()`` (this assertion
triggered often in practice, and many backend bugs were fixed). Explicitly
turning on sentinel tracking also enables ``iterator::isEnd()``. This is
used by ``MachineInstrBundleIterator`` to iterate over bundles.
* ``ilist`` is built on top of ``simple_ilist``, and supports the same
configuration options. As before (and unlike ``simple_ilist``),
``ilist`` takes ownership of its nodes. However, it no longer supports
*allocating* nodes, and is now equivalent to ``iplist``. ``iplist``
will likely be removed in the future.
* ``ilist`` now always uses ``ilist_traits``. Instead of passing a
custom traits class in via a template parameter, clients that want to
customize the traits should specialize ``ilist_traits``. Clients that
want to avoid ownership can specialize ``ilist_alloc_traits`` to inherit
from ``ilist_noalloc_traits`` (or to do something funky); clients that
need callbacks can specialize ``ilist_callback_traits`` directly.
* The underlying data structure is now a simple recursive linked list. The
sentinel node contains only a "next" (``begin()``) and "prev" (``rbegin()``)
pointer and is stored in the same allocation as ``simple_ilist``.
Previously, it was malloc-allocated on-demand by default, although the
now-defunct ``ilist_sentinel_traits`` was sometimes specialized to avoid
this.
* The ``reverse_iterator`` class no longer uses ``std::reverse_iterator``.
Instead, it now has a handle to the same node that it dereferences to.
Reverse iterators now have the same iterator invalidation semantics as
forward iterators.
* ``iterator`` and ``reverse_iterator`` have explicit conversion constructors
that match ``std::reverse_iterator``'s off-by-one semantics, so that
reversing the end points of an iterator range results in the same range
(albeit in reverse). I.e., ``reverse_iterator(begin())`` equals
``rend()``.
* ``iterator::getReverse()`` and ``reverse_iterator::getReverse()`` return an
iterator that dereferences to the *same* node. I.e.,
``begin().getReverse()`` equals ``--rend()``.
* ``ilist_node::getIterator()`` and
``ilist_node::getReverseIterator()`` return the forward and reverse
iterators that dereference to the current node. I.e.,
``begin()->getIterator()`` equals ``begin()`` and
``rbegin()->getReverseIterator()`` equals ``rbegin()``.
* ``iterator`` now stores an ``ilist_node_base*`` instead of a ``T*``. The
implicit conversions between ``ilist::iterator`` and ``T*`` have been
removed. Clients may use ``N->getIterator()`` (if not ``nullptr``) or
``&*I`` (if not ``end()``); alternatively, clients may refactor to use
references for known-good nodes.
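The off-by-one convention mentioned for the explicit conversion constructors is
the one ``std::reverse_iterator`` already uses. The sketch below demonstrates
that convention with ``std::list`` rather than ``ilist``, so it needs only the
standard library; ``IntList``, ``reverseOfBeginIsRend`` and ``derefReversed``
are hypothetical names made up for this example.

```cpp
#include <iterator>
#include <list>

using IntList = std::list<int>;

// Converting begin() to a reverse iterator yields rend(), so reversing both
// end points of [begin(), end()) gives the range [rbegin(), rend()).
inline bool reverseOfBeginIsRend(const IntList &L) {
  return IntList::const_reverse_iterator(L.begin()) == L.rend();
}

// A reverse iterator constructed from I dereferences to the element *before*
// I, which is the off-by-one behaviour. I must not be begin().
inline int derefReversed(IntList::const_iterator I) {
  return *IntList::const_reverse_iterator(I);
}
```

For a list holding 1, 2, 3, ``derefReversed(L.end())`` yields 3. By contrast,
``getReverse()`` as described above returns an iterator to the *same* node,
with no offset.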
Changes to the ARM Targets
--------------------------
**During this release the AArch64 target has:**
* Gained support for ILP32 relocations.
* Gained support for XRay.
* Made even more progress on GlobalISel. There is still some work left before
it is production-ready though.
* Refined the support for Qualcomm's Falkor and Samsung's Exynos CPUs.
* Learned a few new tricks for lowering multiplications by constants, folding
spilled/refilled copies etc.
**During this release the ARM target has:**
* Gained support for ROPI (read-only position independence) and RWPI
(read-write position independence), which can be used to remove the need for
a dynamic linker.
* Gained support for execute-only code, which is placed in pages without read
permissions.
* Gained a machine scheduler for Cortex-R52.
* Gained support for XRay.
* Gained Thumb1 implementations for several compiler-rt builtins. It also
has some support for building the builtins for HF targets.
* Started using the generic bitreverse intrinsic instead of rbit.
* Gained very basic support for GlobalISel.
A lot of work has also been done in LLD for ARM, which now supports more
relocations and TLS.
Changes to the AVR Target
-----------------------------
This marks the first release where the AVR backend has been completely merged
from a fork into LLVM trunk. The backend is still marked experimental, but
is generally quite usable. All downstream development has halted on
`GitHub `_, and changes now go directly into
LLVM trunk.
* Instruction selector and pseudo instruction expansion pass landed
* ``read_register`` and ``write_register`` intrinsics are now supported
* Support stack stores greater than 63 bytes from the bottom of the stack
* A number of assertion errors have been fixed
* Support stores to ``undef`` locations
* Very basic support for the target has been added to clang
* Small optimizations to some 16-bit boolean expressions
Most of the work behind the scenes has been on correctness of generated
assembly, and also fixing some assertions we would hit on some well-formed
inputs.
+
+Changes to the MIPS Target
+-----------------------------
+
+**During this release the MIPS target has:**
+
+* Added support for the two operand form for many instructions.
+* Added the following macros: unaligned load/store, seq, double word load/store for O32.
+* Improved the parsing of complex memory offset expressions.
+* Enabled the integrated assembler by default for Debian mips64el.
+* Added a generic scheduler based on the interAptiv CPU.
+* Added support for thread local relocations.
+* Added recip, rsqrt, evp, dvp, synci instructions in IAS.
+* Optimized the generation of constants in some cases.
+
+**The following issues have been fixed:**
+
+* Thread local debug information is correctly recorded.
+* MSA intrinsics are now range checked.
+* Fixed an issue with MSA and the no-odd-spreg abi.
+* Fixed some corner cases in handling forbidden slots for MIPSR6.
+* Fixed an issue with jumps not being converted to relative branches for assembly.
+* Fixed the handling of local symbols and jal instruction.
+* N32/N64 no longer have their relocation tables sorted as per their ABIs.
+* Fixed a crash when half-precision floating point conversion MSA intrinsics are used.
+* Fixed several crashes involving FastISel.
+* Corrected the definitions for aui/daui/dahi/dati for MIPSR6.
Changes to the OCaml bindings
-----------------------------
* The attribute API was completely overhauled, following the changes
to the C API.
External Open Source Projects Using LLVM 4.0.0
==============================================
LDC - the LLVM-based D compiler
-------------------------------
`D `_ is a language with C-like syntax and static typing. It
pragmatically combines efficiency, control, and modeling power, with safety and
programmer productivity. D supports powerful concepts like Compile-Time Function
Execution (CTFE) and Template Meta-Programming, provides an innovative approach
to concurrency and offers many classical paradigms.
`LDC `_ uses the frontend from the reference compiler
combined with LLVM as backend to produce efficient native code. LDC targets
x86/x86_64 systems like Linux, OS X, FreeBSD and Windows and also Linux on ARM
and PowerPC (32/64 bit). Ports to other architectures like AArch64 and MIPS64
are underway.
Additional Information
======================
A wide variety of additional information is available on the `LLVM web page
`_, in particular in the `documentation
`_ section. The web page also contains versions of the
API documentation that are up to date with the Subversion version of the source
code. You can access versions of these documents specific to this release by
going into the ``llvm/docs/`` directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact
us via the `mailing lists `_.
Index: vendor/llvm/dist/lib/CodeGen/ExecutionDepsFix.cpp
===================================================================
--- vendor/llvm/dist/lib/CodeGen/ExecutionDepsFix.cpp (revision 314410)
+++ vendor/llvm/dist/lib/CodeGen/ExecutionDepsFix.cpp (revision 314411)
@@ -1,867 +1,861 @@
//===- ExecutionDepsFix.cpp - Fix execution dependency issues ---*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file contains the execution dependency fix pass.
//
// Some X86 SSE instructions like mov, and, or, xor are available in different
// variants for different operand types. These variant instructions are
// equivalent, but on Nehalem and newer cpus there is extra latency
// transferring data between integer and floating point domains. ARM cores
// have similar issues when they are configured with both VFP and NEON
// pipelines.
//
// This pass changes the variant instructions to minimize domain crossings.
//
//===----------------------------------------------------------------------===//
#include "llvm/CodeGen/Passes.h"
#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/iterator_range.h"
#include "llvm/CodeGen/LivePhysRegs.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/RegisterClassInfo.h"
#include "llvm/Support/Allocator.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetInstrInfo.h"
#include "llvm/Target/TargetSubtargetInfo.h"
using namespace llvm;
#define DEBUG_TYPE "execution-fix"
/// A DomainValue is a bit like LiveIntervals' ValNo, but it also keeps track
/// of execution domains.
///
/// An open DomainValue represents a set of instructions that can still switch
/// execution domain. Multiple registers may refer to the same open
/// DomainValue - they will eventually be collapsed to the same execution
/// domain.
///
/// A collapsed DomainValue represents a single register that has been forced
/// into one or more execution domains. There is a separate collapsed
/// DomainValue for each register, but it may contain multiple execution
/// domains. A register value is initially created in a single execution
/// domain, but if we were forced to pay the penalty of a domain crossing, we
/// keep track of the fact that the register is now available in multiple
/// domains.
namespace {
struct DomainValue {
// Basic reference counting.
unsigned Refs;
// Bitmask of available domains. For an open DomainValue, it is the still
// possible domains for collapsing. For a collapsed DomainValue it is the
// domains where the register is available for free.
unsigned AvailableDomains;
// Pointer to the next DomainValue in a chain. When two DomainValues are
// merged, Victim.Next is set to point to Victor, so old DomainValue
// references can be updated by following the chain.
DomainValue *Next;
// Twiddleable instructions using or defining these registers.
SmallVector<MachineInstr*, 8> Instrs;
// A collapsed DomainValue has no instructions to twiddle - it simply keeps
// track of the domains where the registers are already available.
bool isCollapsed() const { return Instrs.empty(); }
// Is domain available?
bool hasDomain(unsigned domain) const {
assert(domain <
static_cast<unsigned>(std::numeric_limits<unsigned>::digits) &&
"undefined behavior");
return AvailableDomains & (1u << domain);
}
// Mark domain as available.
void addDomain(unsigned domain) {
AvailableDomains |= 1u << domain;
}
// Restrict to a single domain available.
void setSingleDomain(unsigned domain) {
AvailableDomains = 1u << domain;
}
// Return bitmask of domains that are available and in mask.
unsigned getCommonDomains(unsigned mask) const {
return AvailableDomains & mask;
}
// First domain available.
unsigned getFirstDomain() const {
return countTrailingZeros(AvailableDomains);
}
DomainValue() : Refs(0) { clear(); }
// Clear this DomainValue and point to next which has all its data.
void clear() {
AvailableDomains = 0;
Next = nullptr;
Instrs.clear();
}
};
}
namespace {
/// Information about a live register.
struct LiveReg {
/// Value currently in this register, or NULL when no value is being tracked.
/// This counts as a DomainValue reference.
DomainValue *Value;
/// Instruction that defined this register, relative to the beginning of the
/// current basic block. When a LiveReg is used to represent a live-out
/// register, this value is relative to the end of the basic block, so it
/// will be a negative number.
int Def;
};
} // anonymous namespace
namespace {
class ExeDepsFix : public MachineFunctionPass {
static char ID;
SpecificBumpPtrAllocator<DomainValue> Allocator;
SmallVector<DomainValue*,16> Avail;
const TargetRegisterClass *const RC;
MachineFunction *MF;
const TargetInstrInfo *TII;
const TargetRegisterInfo *TRI;
RegisterClassInfo RegClassInfo;
std::vector<SmallVector<int, 1>> AliasMap;
const unsigned NumRegs;
LiveReg *LiveRegs;
typedef DenseMap<MachineBasicBlock*, LiveReg*> LiveOutMap;
LiveOutMap LiveOuts;
/// List of undefined register reads in this block in forward order.
std::vector<std::pair<MachineInstr*, unsigned> > UndefReads;
/// Storage for register unit liveness.
LivePhysRegs LiveRegSet;
/// Current instruction number.
/// The first instruction in each basic block is 0.
int CurInstr;
/// True when the current block has a predecessor that hasn't been visited
/// yet.
bool SeenUnknownBackEdge;
public:
ExeDepsFix(const TargetRegisterClass *rc)
: MachineFunctionPass(ID), RC(rc), NumRegs(RC->getNumRegs()) {}
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.setPreservesAll();
MachineFunctionPass::getAnalysisUsage(AU);
}
bool runOnMachineFunction(MachineFunction &MF) override;
MachineFunctionProperties getRequiredProperties() const override {
return MachineFunctionProperties().set(
MachineFunctionProperties::Property::NoVRegs);
}
StringRef getPassName() const override { return "Execution dependency fix"; }
private:
iterator_range<SmallVectorImpl<int>::const_iterator>
regIndices(unsigned Reg) const;
// DomainValue allocation.
DomainValue *alloc(int domain = -1);
DomainValue *retain(DomainValue *DV) {
if (DV) ++DV->Refs;
return DV;
}
void release(DomainValue*);
DomainValue *resolve(DomainValue*&);
// LiveRegs manipulations.
void setLiveReg(int rx, DomainValue *DV);
void kill(int rx);
void force(int rx, unsigned domain);
void collapse(DomainValue *dv, unsigned domain);
bool merge(DomainValue *A, DomainValue *B);
void enterBasicBlock(MachineBasicBlock*);
void leaveBasicBlock(MachineBasicBlock*);
void visitInstr(MachineInstr*);
void processDefs(MachineInstr*, bool Kill);
void visitSoftInstr(MachineInstr*, unsigned mask);
void visitHardInstr(MachineInstr*, unsigned domain);
void pickBestRegisterForUndef(MachineInstr *MI, unsigned OpIdx,
unsigned Pref);
bool shouldBreakDependence(MachineInstr*, unsigned OpIdx, unsigned Pref);
void processUndefReads(MachineBasicBlock*);
};
}
char ExeDepsFix::ID = 0;
/// Translate TRI register number to a list of indices into our smaller tables
/// of interesting registers.
iterator_range<SmallVectorImpl<int>::const_iterator>
ExeDepsFix::regIndices(unsigned Reg) const {
assert(Reg < AliasMap.size() && "Invalid register");
const auto &Entry = AliasMap[Reg];
return make_range(Entry.begin(), Entry.end());
}
DomainValue *ExeDepsFix::alloc(int domain) {
DomainValue *dv = Avail.empty() ?
new(Allocator.Allocate()) DomainValue :
Avail.pop_back_val();
if (domain >= 0)
dv->addDomain(domain);
assert(dv->Refs == 0 && "Reference count wasn't cleared");
assert(!dv->Next && "Chained DomainValue shouldn't have been recycled");
return dv;
}
/// Release a reference to DV. When the last reference is released,
/// collapse if needed.
void ExeDepsFix::release(DomainValue *DV) {
while (DV) {
assert(DV->Refs && "Bad DomainValue");
if (--DV->Refs)
return;
// There are no more DV references. Collapse any contained instructions.
if (DV->AvailableDomains && !DV->isCollapsed())
collapse(DV, DV->getFirstDomain());
DomainValue *Next = DV->Next;
DV->clear();
Avail.push_back(DV);
// Also release the next DomainValue in the chain.
DV = Next;
}
}
/// Follow the chain of dead DomainValues until a live DomainValue is reached.
/// Update the referenced pointer when necessary.
DomainValue *ExeDepsFix::resolve(DomainValue *&DVRef) {
DomainValue *DV = DVRef;
if (!DV || !DV->Next)
return DV;
// DV has a chain. Find the end.
do DV = DV->Next;
while (DV->Next);
// Update DVRef to point to DV.
retain(DV);
release(DVRef);
DVRef = DV;
return DV;
}
/// Set LiveRegs[rx] = dv, updating reference counts.
void ExeDepsFix::setLiveReg(int rx, DomainValue *dv) {
assert(unsigned(rx) < NumRegs && "Invalid index");
assert(LiveRegs && "Must enter basic block first.");
if (LiveRegs[rx].Value == dv)
return;
if (LiveRegs[rx].Value)
release(LiveRegs[rx].Value);
LiveRegs[rx].Value = retain(dv);
}
// Kill register rx, recycle or collapse any DomainValue.
void ExeDepsFix::kill(int rx) {
assert(unsigned(rx) < NumRegs && "Invalid index");
assert(LiveRegs && "Must enter basic block first.");
if (!LiveRegs[rx].Value)
return;
release(LiveRegs[rx].Value);
LiveRegs[rx].Value = nullptr;
}
/// Force register rx into domain.
void ExeDepsFix::force(int rx, unsigned domain) {
assert(unsigned(rx) < NumRegs && "Invalid index");
assert(LiveRegs && "Must enter basic block first.");
if (DomainValue *dv = LiveRegs[rx].Value) {
if (dv->isCollapsed())
dv->addDomain(domain);
else if (dv->hasDomain(domain))
collapse(dv, domain);
else {
// This is an incompatible open DomainValue. Collapse it to whatever and
// force the new value into domain. This costs a domain crossing.
collapse(dv, dv->getFirstDomain());
assert(LiveRegs[rx].Value && "Not live after collapse?");
LiveRegs[rx].Value->addDomain(domain);
}
} else {
// Set up basic collapsed DomainValue.
setLiveReg(rx, alloc(domain));
}
}
/// Collapse open DomainValue into given domain. If there are multiple
/// registers using dv, they each get a unique collapsed DomainValue.
void ExeDepsFix::collapse(DomainValue *dv, unsigned domain) {
assert(dv->hasDomain(domain) && "Cannot collapse");
// Collapse all the instructions.
while (!dv->Instrs.empty())
TII->setExecutionDomain(*dv->Instrs.pop_back_val(), domain);
dv->setSingleDomain(domain);
// If there are multiple users, give them new, unique DomainValues.
if (LiveRegs && dv->Refs > 1)
for (unsigned rx = 0; rx != NumRegs; ++rx)
if (LiveRegs[rx].Value == dv)
setLiveReg(rx, alloc(domain));
}
/// All instructions and registers in B are moved to A, and B is released.
bool ExeDepsFix::merge(DomainValue *A, DomainValue *B) {
assert(!A->isCollapsed() && "Cannot merge into collapsed");
assert(!B->isCollapsed() && "Cannot merge from collapsed");
if (A == B)
return true;
// Restrict to the domains that A and B have in common.
unsigned common = A->getCommonDomains(B->AvailableDomains);
if (!common)
return false;
A->AvailableDomains = common;
A->Instrs.append(B->Instrs.begin(), B->Instrs.end());
// Clear the old DomainValue so we won't try to swizzle instructions twice.
B->clear();
// All uses of B are referred to A.
B->Next = retain(A);
for (unsigned rx = 0; rx != NumRegs; ++rx) {
assert(LiveRegs && "no space allocated for live registers");
if (LiveRegs[rx].Value == B)
setLiveReg(rx, A);
}
return true;
}
/// Set up LiveRegs by merging predecessor live-out values.
void ExeDepsFix::enterBasicBlock(MachineBasicBlock *MBB) {
// Detect back-edges from predecessors we haven't processed yet.
SeenUnknownBackEdge = false;
// Reset instruction counter in each basic block.
CurInstr = 0;
// Set up UndefReads to track undefined register reads.
UndefReads.clear();
LiveRegSet.clear();
// Set up LiveRegs to represent registers entering MBB.
if (!LiveRegs)
LiveRegs = new LiveReg[NumRegs];
// Default values are 'nothing happened a long time ago'.
for (unsigned rx = 0; rx != NumRegs; ++rx) {
LiveRegs[rx].Value = nullptr;
LiveRegs[rx].Def = -(1 << 20);
}
// This is the entry block.
if (MBB->pred_empty()) {
for (const auto &LI : MBB->liveins()) {
for (int rx : regIndices(LI.PhysReg)) {
// Treat function live-ins as if they were defined just before the first
// instruction. Usually, function arguments are set up immediately
// before the call.
LiveRegs[rx].Def = -1;
}
}
DEBUG(dbgs() << "BB#" << MBB->getNumber() << ": entry\n");
return;
}
// Try to coalesce live-out registers from predecessors.
for (MachineBasicBlock::const_pred_iterator pi = MBB->pred_begin(),
pe = MBB->pred_end(); pi != pe; ++pi) {
LiveOutMap::const_iterator fi = LiveOuts.find(*pi);
if (fi == LiveOuts.end()) {
SeenUnknownBackEdge = true;
continue;
}
assert(fi->second && "Can't have NULL entries");
for (unsigned rx = 0; rx != NumRegs; ++rx) {
// Use the most recent predecessor def for each register.
LiveRegs[rx].Def = std::max(LiveRegs[rx].Def, fi->second[rx].Def);
DomainValue *pdv = resolve(fi->second[rx].Value);
if (!pdv)
continue;
if (!LiveRegs[rx].Value) {
setLiveReg(rx, pdv);
continue;
}
// We have a live DomainValue from more than one predecessor.
if (LiveRegs[rx].Value->isCollapsed()) {
// We are already collapsed, but predecessor is not. Force it.
unsigned Domain = LiveRegs[rx].Value->getFirstDomain();
if (!pdv->isCollapsed() && pdv->hasDomain(Domain))
collapse(pdv, Domain);
continue;
}
// Currently open, merge in predecessor.
if (!pdv->isCollapsed())
merge(LiveRegs[rx].Value, pdv);
else
force(rx, pdv->getFirstDomain());
}
}
DEBUG(dbgs() << "BB#" << MBB->getNumber()
<< (SeenUnknownBackEdge ? ": incomplete\n" : ": all preds known\n"));
}
void ExeDepsFix::leaveBasicBlock(MachineBasicBlock *MBB) {
assert(LiveRegs && "Must enter basic block first.");
// Save live registers at end of MBB - used by enterBasicBlock().
// Also use LiveOuts as a visited set to detect back-edges.
bool First = LiveOuts.insert(std::make_pair(MBB, LiveRegs)).second;
if (First) {
// LiveRegs was inserted in LiveOuts. Adjust all defs to be relative to
// the end of this block instead of the beginning.
for (unsigned i = 0, e = NumRegs; i != e; ++i)
LiveRegs[i].Def -= CurInstr;
} else {
// Insertion failed, this must be the second pass.
// Release all the DomainValues instead of keeping them.
for (unsigned i = 0, e = NumRegs; i != e; ++i)
release(LiveRegs[i].Value);
delete[] LiveRegs;
}
LiveRegs = nullptr;
}
void ExeDepsFix::visitInstr(MachineInstr *MI) {
if (MI->isDebugValue())
return;
// Update instructions with explicit execution domains.
std::pair<uint16_t, uint16_t> DomP = TII->getExecutionDomain(*MI);
if (DomP.first) {
if (DomP.second)
visitSoftInstr(MI, DomP.second);
else
visitHardInstr(MI, DomP.first);
}
// Process defs to track register ages, and kill values clobbered by generic
// instructions.
processDefs(MI, !DomP.first);
}
/// \brief Helps avoid false dependencies on undef registers by updating the
/// machine instructions' undef operand to use a register that the instruction
/// is truly dependent on, or use a register with clearance higher than Pref.
void ExeDepsFix::pickBestRegisterForUndef(MachineInstr *MI, unsigned OpIdx,
unsigned Pref) {
MachineOperand &MO = MI->getOperand(OpIdx);
assert(MO.isUndef() && "Expected undef machine operand");
unsigned OriginalReg = MO.getReg();
// Update only undef operands that are mapped to one register.
if (AliasMap[OriginalReg].size() != 1)
return;
// Get the undef operand's register class
const TargetRegisterClass *OpRC =
TII->getRegClass(MI->getDesc(), OpIdx, TRI, *MF);
// If the instruction has a true dependency, we can hide the false dependency
// behind it.
for (MachineOperand &CurrMO : MI->operands()) {
if (!CurrMO.isReg() || CurrMO.isDef() || CurrMO.isUndef() ||
!OpRC->contains(CurrMO.getReg()))
continue;
// We found a true dependency - replace the undef register with the true
// dependency.
MO.setReg(CurrMO.getReg());
return;
}
// Go over all registers in the register class and find the register with
// max clearance or clearance higher than Pref.
unsigned MaxClearance = 0;
unsigned MaxClearanceReg = OriginalReg;
ArrayRef<MCPhysReg> Order = RegClassInfo.getOrder(OpRC);
for (auto Reg : Order) {
assert(AliasMap[Reg].size() == 1 &&
"Reg is expected to be mapped to a single index");
int RCrx = *regIndices(Reg).begin();
unsigned Clearance = CurInstr - LiveRegs[RCrx].Def;
if (Clearance <= MaxClearance)
continue;
MaxClearance = Clearance;
MaxClearanceReg = Reg;
if (MaxClearance > Pref)
break;
}
// Update the operand if we found a register with better clearance.
if (MaxClearanceReg != OriginalReg)
MO.setReg(MaxClearanceReg);
}
/// \brief Return true if it makes sense to break dependence on a partial def
/// or undef use.
bool ExeDepsFix::shouldBreakDependence(MachineInstr *MI, unsigned OpIdx,
unsigned Pref) {
unsigned reg = MI->getOperand(OpIdx).getReg();
for (int rx : regIndices(reg)) {
unsigned Clearance = CurInstr - LiveRegs[rx].Def;
DEBUG(dbgs() << "Clearance: " << Clearance << ", want " << Pref);
if (Pref > Clearance) {
DEBUG(dbgs() << ": Break dependency.\n");
continue;
}
// The current clearance seems OK, but we may be ignoring a def from a
// back-edge.
if (!SeenUnknownBackEdge || Pref <= unsigned(CurInstr)) {
DEBUG(dbgs() << ": OK .\n");
return false;
}
// A def from an unprocessed back-edge may make us break this dependency.
DEBUG(dbgs() << ": Wait for back-edge to resolve.\n");
return false;
}
return true;
}
// Update def-ages for registers defined by MI.
// If Kill is set, also kill off DomainValues clobbered by the defs.
//
// Also break dependencies on partial defs and undef uses.
void ExeDepsFix::processDefs(MachineInstr *MI, bool Kill) {
assert(!MI->isDebugValue() && "Won't process debug values");
// Break dependence on undef uses. Do this before updating LiveRegs below.
unsigned OpNum;
unsigned Pref = TII->getUndefRegClearance(*MI, OpNum, TRI);
if (Pref) {
pickBestRegisterForUndef(MI, OpNum, Pref);
if (shouldBreakDependence(MI, OpNum, Pref))
UndefReads.push_back(std::make_pair(MI, OpNum));
}
const MCInstrDesc &MCID = MI->getDesc();
for (unsigned i = 0,
e = MI->isVariadic() ? MI->getNumOperands() : MCID.getNumDefs();
i != e; ++i) {
MachineOperand &MO = MI->getOperand(i);
if (!MO.isReg())
continue;
if (MO.isUse())
continue;
for (int rx : regIndices(MO.getReg())) {
// This instruction explicitly defines rx.
DEBUG(dbgs() << TRI->getName(RC->getRegister(rx)) << ":\t" << CurInstr
<< '\t' << *MI);
// Check clearance before partial register updates.
// Call breakDependence before setting LiveRegs[rx].Def.
unsigned Pref = TII->getPartialRegUpdateClearance(*MI, i, TRI);
if (Pref && shouldBreakDependence(MI, i, Pref))
TII->breakPartialRegDependency(*MI, i, TRI);
// How many instructions since rx was last written?
LiveRegs[rx].Def = CurInstr;
// Kill off domains redefined by generic instructions.
if (Kill)
kill(rx);
}
}
++CurInstr;
}
/// \brief Break false dependencies on undefined register reads.
///
/// Walk the block backward computing precise liveness. This is expensive, so we
/// only do it on demand. Note that the occurrence of undefined register reads
/// that should be broken is very rare, but when they occur we may have many in
/// a single block.
void ExeDepsFix::processUndefReads(MachineBasicBlock *MBB) {
if (UndefReads.empty())
return;
// Collect this block's live out register units.
LiveRegSet.init(*TRI);
// We do not need to care about pristine registers as they are just preserved
// but not actually used in the function.
LiveRegSet.addLiveOutsNoPristines(*MBB);
MachineInstr *UndefMI = UndefReads.back().first;
unsigned OpIdx = UndefReads.back().second;
for (MachineInstr &I : make_range(MBB->rbegin(), MBB->rend())) {
// Update liveness, including the current instruction's defs.
LiveRegSet.stepBackward(I);
if (UndefMI == &I) {
if (!LiveRegSet.contains(UndefMI->getOperand(OpIdx).getReg()))
TII->breakPartialRegDependency(*UndefMI, OpIdx, TRI);
UndefReads.pop_back();
if (UndefReads.empty())
return;
UndefMI = UndefReads.back().first;
OpIdx = UndefReads.back().second;
}
}
}
// A hard instruction only works in one domain. All input registers will be
// forced into that domain.
void ExeDepsFix::visitHardInstr(MachineInstr *mi, unsigned domain) {
// Collapse all uses.
for (unsigned i = mi->getDesc().getNumDefs(),
e = mi->getDesc().getNumOperands(); i != e; ++i) {
MachineOperand &mo = mi->getOperand(i);
if (!mo.isReg()) continue;
for (int rx : regIndices(mo.getReg())) {
force(rx, domain);
}
}
// Kill all defs and force them.
for (unsigned i = 0, e = mi->getDesc().getNumDefs(); i != e; ++i) {
MachineOperand &mo = mi->getOperand(i);
if (!mo.isReg()) continue;
for (int rx : regIndices(mo.getReg())) {
kill(rx);
force(rx, domain);
}
}
}
// A soft instruction can be changed to work in other domains given by mask.
void ExeDepsFix::visitSoftInstr(MachineInstr *mi, unsigned mask) {
// Bitmask of available domains for this instruction after taking collapsed
// operands into account.
unsigned available = mask;
// Scan the explicit use operands for incoming domains.
SmallVector<int, 4> used;
if (LiveRegs)
for (unsigned i = mi->getDesc().getNumDefs(),
e = mi->getDesc().getNumOperands(); i != e; ++i) {
MachineOperand &mo = mi->getOperand(i);
if (!mo.isReg()) continue;
for (int rx : regIndices(mo.getReg())) {
DomainValue *dv = LiveRegs[rx].Value;
if (dv == nullptr)
continue;
// Bitmask of domains that dv and available have in common.
unsigned common = dv->getCommonDomains(available);
// Is it possible to use this collapsed register for free?
if (dv->isCollapsed()) {
// Restrict available domains to the ones in common with the operand.
// If there are no common domains, we must pay the cross-domain
// penalty for this operand.
if (common) available = common;
} else if (common)
// Open DomainValue is compatible, save it for merging.
used.push_back(rx);
else
// Open DomainValue is not compatible with instruction. It is useless
// now.
kill(rx);
}
}
// If the collapsed operands force a single domain, propagate the collapse.
if (isPowerOf2_32(available)) {
unsigned domain = countTrailingZeros(available);
TII->setExecutionDomain(*mi, domain);
visitHardInstr(mi, domain);
return;
}
// Kill off any remaining uses that don't match available, and build a list of
// incoming DomainValues that we want to merge.
- SmallVector<LiveReg, 4> Regs;
- for (SmallVectorImpl<int>::iterator i=used.begin(), e=used.end(); i!=e; ++i) {
- int rx = *i;
+ SmallVector<const LiveReg *, 4> Regs;
+ for (int rx : used) {
assert(LiveRegs && "no space allocated for live registers");
const LiveReg &LR = LiveRegs[rx];
// This useless DomainValue could have been missed above.
if (!LR.Value->getCommonDomains(available)) {
kill(rx);
continue;
}
// Sorted insertion.
- bool Inserted = false;
- for (SmallVectorImpl<LiveReg>::iterator i = Regs.begin(), e = Regs.end();
- i != e && !Inserted; ++i) {
- if (LR.Def < i->Def) {
- Inserted = true;
- Regs.insert(i, LR);
- }
- }
- if (!Inserted)
- Regs.push_back(LR);
+ auto I = std::upper_bound(Regs.begin(), Regs.end(), &LR,
+ [](const LiveReg *LHS, const LiveReg *RHS) {
+ return LHS->Def < RHS->Def;
+ });
+ Regs.insert(I, &LR);
}
// doms are now sorted in order of appearance. Try to merge them all, giving
// priority to the latest ones.
DomainValue *dv = nullptr;
while (!Regs.empty()) {
if (!dv) {
- dv = Regs.pop_back_val().Value;
+ dv = Regs.pop_back_val()->Value;
// Force the first dv to match the current instruction.
dv->AvailableDomains = dv->getCommonDomains(available);
assert(dv->AvailableDomains && "Domain should have been filtered");
continue;
}
- DomainValue *Latest = Regs.pop_back_val().Value;
+ DomainValue *Latest = Regs.pop_back_val()->Value;
// Skip already merged values.
if (Latest == dv || Latest->Next)
continue;
if (merge(dv, Latest))
continue;
// If latest didn't merge, it is useless now. Kill all registers using it.
for (int i : used) {
assert(LiveRegs && "no space allocated for live registers");
if (LiveRegs[i].Value == Latest)
kill(i);
}
}
// dv is the DomainValue we are going to use for this instruction.
if (!dv) {
dv = alloc();
dv->AvailableDomains = available;
}
dv->Instrs.push_back(mi);
// Finally set all defs and non-collapsed uses to dv. We must iterate through
// all the operands, including imp-def ones.
for (MachineInstr::mop_iterator ii = mi->operands_begin(),
ee = mi->operands_end();
ii != ee; ++ii) {
MachineOperand &mo = *ii;
if (!mo.isReg()) continue;
for (int rx : regIndices(mo.getReg())) {
if (!LiveRegs[rx].Value || (mo.isDef() && LiveRegs[rx].Value != dv)) {
kill(rx);
setLiveReg(rx, dv);
}
}
}
}
bool ExeDepsFix::runOnMachineFunction(MachineFunction &mf) {
if (skipFunction(*mf.getFunction()))
return false;
MF = &mf;
TII = MF->getSubtarget().getInstrInfo();
TRI = MF->getSubtarget().getRegisterInfo();
RegClassInfo.runOnMachineFunction(mf);
LiveRegs = nullptr;
assert(NumRegs == RC->getNumRegs() && "Bad regclass");
DEBUG(dbgs() << "********** FIX EXECUTION DEPENDENCIES: "
<< TRI->getRegClassName(RC) << " **********\n");
// If no relevant registers are used in the function, we can skip it
// completely.
bool anyregs = false;
const MachineRegisterInfo &MRI = mf.getRegInfo();
for (unsigned Reg : *RC) {
if (MRI.isPhysRegUsed(Reg)) {
anyregs = true;
break;
}
}
if (!anyregs) return false;
// Initialize the AliasMap on the first use.
if (AliasMap.empty()) {
// Given a PhysReg, AliasMap[PhysReg] returns a list of indices into RC and
// therefore the LiveRegs array.
AliasMap.resize(TRI->getNumRegs());
for (unsigned i = 0, e = RC->getNumRegs(); i != e; ++i)
for (MCRegAliasIterator AI(RC->getRegister(i), TRI, true);
AI.isValid(); ++AI)
AliasMap[*AI].push_back(i);
}
MachineBasicBlock *Entry = &*MF->begin();
ReversePostOrderTraversal<MachineBasicBlock*> RPOT(Entry);
SmallVector<MachineBasicBlock*, 16> Loops;
for (ReversePostOrderTraversal<MachineBasicBlock*>::rpo_iterator
MBBI = RPOT.begin(), MBBE = RPOT.end(); MBBI != MBBE; ++MBBI) {
MachineBasicBlock *MBB = *MBBI;
enterBasicBlock(MBB);
if (SeenUnknownBackEdge)
Loops.push_back(MBB);
for (MachineInstr &MI : *MBB)
visitInstr(&MI);
processUndefReads(MBB);
leaveBasicBlock(MBB);
}
// Visit all the loop blocks again in order to merge DomainValues from
// back-edges.
for (MachineBasicBlock *MBB : Loops) {
enterBasicBlock(MBB);
for (MachineInstr &MI : *MBB)
if (!MI.isDebugValue())
processDefs(&MI, false);
processUndefReads(MBB);
leaveBasicBlock(MBB);
}
// Clear the LiveOuts vectors and collapse any remaining DomainValues.
for (ReversePostOrderTraversal<MachineBasicBlock*>::rpo_iterator
MBBI = RPOT.begin(), MBBE = RPOT.end(); MBBI != MBBE; ++MBBI) {
LiveOutMap::const_iterator FI = LiveOuts.find(*MBBI);
if (FI == LiveOuts.end() || !FI->second)
continue;
for (unsigned i = 0, e = NumRegs; i != e; ++i)
if (FI->second[i].Value)
release(FI->second[i].Value);
delete[] FI->second;
}
LiveOuts.clear();
UndefReads.clear();
Avail.clear();
Allocator.DestroyAll();
return false;
}
FunctionPass *
llvm::createExecutionDependencyFixPass(const TargetRegisterClass *RC) {
return new ExeDepsFix(RC);
}
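Note on the visitSoftInstr hunk above: the hand-written sorted-insertion loop is replaced by std::upper_bound over a vector of pointers. The following is a minimal standalone sketch of that idea, not the pass's actual code. It assumes a hypothetical LiveReg with only a Def field plus an illustrative Id tag, uses std::vector instead of SmallVector, and stores pointers in both variants so the two results can be compared directly (the removed code stored LiveReg copies).

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the pass's LiveReg; only Def is used for ordering.
struct LiveReg {
  int Def; // instruction index of the last definition
  int Id;  // illustrative tag, not part of the original structure
};

// Strict ordering on Def, matching the lambda in the added lines.
static bool defLess(const LiveReg *LHS, const LiveReg *RHS) {
  return LHS->Def < RHS->Def;
}

// Shape of the removed code: scan linearly and insert before the first
// element whose Def is greater, else append.
void insertLinear(std::vector<const LiveReg *> &Regs, const LiveReg &LR) {
  for (auto I = Regs.begin(), E = Regs.end(); I != E; ++I) {
    if (LR.Def < (*I)->Def) {
      Regs.insert(I, &LR);
      return; // stop here; I is invalidated by the insert
    }
  }
  Regs.push_back(&LR);
}

// Shape of the added code: binary-search for the same position.
void insertSorted(std::vector<const LiveReg *> &Regs, const LiveReg &LR) {
  // std::upper_bound requires the range to be ordered by the comparator.
  assert(std::is_sorted(Regs.begin(), Regs.end(), defLess));
  auto I = std::upper_bound(Regs.begin(), Regs.end(), &LR, defLess);
  Regs.insert(I, &LR);
}
```

Both helpers place a new entry after any existing entries with an equal Def, so entries with the same Def keep their insertion order in either variant.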
Index: vendor/llvm/dist/test/CodeGen/X86/pr30284.ll
===================================================================
--- vendor/llvm/dist/test/CodeGen/X86/pr30284.ll (nonexistent)
+++ vendor/llvm/dist/test/CodeGen/X86/pr30284.ll (revision 314411)
@@ -0,0 +1,22 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=i386-unknown-linux-gnu -mattr=avx512dq | FileCheck %s
+
+define void @f_f___un_3C_unf_3E_un_3C_unf_3E_() {
+; CHECK-LABEL: f_f___un_3C_unf_3E_un_3C_unf_3E_:
+; CHECK: # BB#0:
+; CHECK-NEXT: vmovapd 0, %zmm0
+; CHECK-NEXT: vmovapd 64, %zmm1
+; CHECK-NEXT: vmovapd {{.*#+}} zmm2 = [0,16,0,16,0,16,0,16,0,16,0,16,0,16,0,16]
+; CHECK-NEXT: kshiftrw $8, %k0, %k1
+; CHECK-NEXT: vorpd %zmm2, %zmm1, %zmm1 {%k1}
+; CHECK-NEXT: vorpd %zmm2, %zmm0, %zmm0 {%k1}
+; CHECK-NEXT: vmovapd %zmm0, 0
+; CHECK-NEXT: vmovapd %zmm1, 64
+; CHECK-NEXT: retl
+ %a_load22 = load <16 x i64>, <16 x i64>* null, align 1
+ %bitop = or <16 x i64> %a_load22,
+ %v.i = load <16 x i64>, <16 x i64>* null
+ %v1.i41 = select <16 x i1> undef, <16 x i64> %bitop, <16 x i64> %v.i
+ store <16 x i64> %v1.i41, <16 x i64>* null
+ ret void
+}
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/Inputs/simple-xray-instrmap.yaml
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/Inputs/simple-xray-instrmap.yaml (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/Inputs/simple-xray-instrmap.yaml (revision 314411)
@@ -1,14 +1,8 @@
---
-- { id: 1, address: 0x000000000041CA40, function: 0x000000000041CA40, kind: function-enter,
- always-instrument: true }
-- { id: 1, address: 0x000000000041CA50, function: 0x000000000041CA40, kind: tail-exit,
- always-instrument: true }
-- { id: 2, address: 0x000000000041CA70, function: 0x000000000041CA70, kind: function-enter,
- always-instrument: true }
-- { id: 2, address: 0x000000000041CA7C, function: 0x000000000041CA70, kind: tail-exit,
- always-instrument: true }
-- { id: 3, address: 0x000000000041CAA0, function: 0x000000000041CAA0, kind: function-enter,
- always-instrument: true }
-- { id: 3, address: 0x000000000041CAB4, function: 0x000000000041CAA0, kind: function-exit,
- always-instrument: true }
+- { id: 1, address: 0x000000000041CA40, function: 0x000000000041CA40, kind: function-enter, always-instrument: true }
+- { id: 1, address: 0x000000000041CA50, function: 0x000000000041CA40, kind: tail-exit, always-instrument: true }
+- { id: 2, address: 0x000000000041CA70, function: 0x000000000041CA70, kind: function-enter, always-instrument: true }
+- { id: 2, address: 0x000000000041CA7C, function: 0x000000000041CA70, kind: tail-exit, always-instrument: true }
+- { id: 3, address: 0x000000000041CAA0, function: 0x000000000041CAA0, kind: function-enter, always-instrument: true }
+- { id: 3, address: 0x000000000041CAB4, function: 0x000000000041CAA0, kind: function-exit, always-instrument: true }
...
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/account-simple-case.yaml
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/account-simple-case.yaml (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/account-simple-case.yaml (revision 314411)
@@ -1,18 +1,16 @@
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml | FileCheck %s
---
header:
version: 1
type: 0
constant-tsc: true
nonstop-tsc: true
cycle-frequency: 2601000000
records:
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter,
- tsc: 10001 }
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit,
- tsc: 10100 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter, tsc: 10001 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit, tsc: 10100 }
...
#CHECK: Functions with latencies: 1
#CHECK-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#CHECK-NEXT: 1 1 [ {{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/account-simple-sorting.yaml
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/account-simple-sorting.yaml (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/account-simple-sorting.yaml (revision 314411)
@@ -1,85 +1,75 @@
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml | FileCheck --check-prefix DEFAULT %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s count | FileCheck --check-prefix COUNT-ASC %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s min | FileCheck --check-prefix MIN-ASC %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s max | FileCheck --check-prefix MAX-ASC %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s sum | FileCheck --check-prefix SUM-ASC %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s count -r dsc | FileCheck --check-prefix COUNT-DSC %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s min -r dsc | FileCheck --check-prefix MIN-DSC %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s max -r dsc | FileCheck --check-prefix MAX-DSC %s
#RUN: llvm-xray account %s -o - -m %S/Inputs/simple-instrmap.yaml -t yaml -s sum -r dsc | FileCheck --check-prefix SUM-DSC %s
---
header:
version: 1
type: 0
constant-tsc: true
nonstop-tsc: true
cycle-frequency: 1
records:
# Function id: 1
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter,
- tsc: 10001 }
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit,
- tsc: 10100 }
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter,
- tsc: 10101 }
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit,
- tsc: 10200 }
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter,
- tsc: 10201 }
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit,
- tsc: 10300 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter, tsc: 10001 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit, tsc: 10100 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter, tsc: 10101 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit, tsc: 10200 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter, tsc: 10201 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit, tsc: 10300 }
# Function id: 2
- - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-enter,
- tsc: 10001 }
- - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-exit,
- tsc: 10002 }
- - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-enter,
- tsc: 10101 }
- - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-exit,
- tsc: 10102 }
+ - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-enter, tsc: 10001 }
+ - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-exit, tsc: 10002 }
+ - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-enter, tsc: 10101 }
+ - { type: 0, func-id: 2, cpu: 1, thread: 222, kind: function-exit, tsc: 10102 }
#DEFAULT: Functions with latencies: 2
#DEFAULT-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#DEFAULT-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#DEFAULT-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#COUNT-ASC: Functions with latencies: 2
#COUNT-ASC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#COUNT-ASC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#COUNT-ASC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#COUNT-DSC: Functions with latencies: 2
#COUNT-DSC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#COUNT-DSC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#COUNT-DSC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MIN-ASC: Functions with latencies: 2
#MIN-ASC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#MIN-ASC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MIN-ASC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MIN-DSC: Functions with latencies: 2
#MIN-DSC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#MIN-DSC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MIN-DSC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MAX-ASC: Functions with latencies: 2
#MAX-ASC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#MAX-ASC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MAX-ASC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MAX-DSC: Functions with latencies: 2
#MAX-DSC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#MAX-DSC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#MAX-DSC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#SUM-ASC: Functions with latencies: 2
#SUM-ASC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#SUM-ASC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#SUM-ASC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#SUM-DSC: Functions with latencies: 2
#SUM-DSC-NEXT: funcid count [ min, med, 90p, 99p, max] sum function
#SUM-DSC-NEXT: 1 3 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
#SUM-DSC-NEXT: 2 2 [{{.*}}, {{.*}}, {{.*}}, {{.*}}, {{.*}}] {{.*}} {{.*}}
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/convert-roundtrip.yaml
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/convert-roundtrip.yaml (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/convert-roundtrip.yaml (revision 314411)
@@ -1,28 +1,24 @@
#RUN: llvm-xray convert %s -f=raw -o %t && llvm-xray convert %t -f=yaml -o - | FileCheck %s
---
header:
version: 1
type: 0
constant-tsc: true
nonstop-tsc: true
cycle-frequency: 2601000000
records:
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter,
- tsc: 10001 }
- - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit,
- tsc: 10100 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-enter, tsc: 10001 }
+ - { type: 0, func-id: 1, cpu: 1, thread: 111, kind: function-exit, tsc: 10100 }
...
#CHECK: ---
#CHECK-NEXT: header:
#CHECK-NEXT: version: 1
#CHECK-NEXT: type: 0
#CHECK-NEXT: constant-tsc: true
#CHECK-NEXT: nonstop-tsc: true
#CHECK-NEXT: cycle-frequency: 2601000000
#CHECK-NEXT: records:
-#CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 1, thread: 111, kind: function-enter,
-#CHECK-NEXT: tsc: 10001 }
-#CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 1, thread: 111, kind: function-exit,
-#CHECK-NEXT: tsc: 10100 }
+#CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 1, thread: 111, kind: function-enter, tsc: 10001 }
+#CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 1, thread: 111, kind: function-exit, tsc: 10100 }
#CHECK-NEXT: ...
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/convert-to-yaml.txt
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/convert-to-yaml.txt (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/convert-to-yaml.txt (revision 314411)
@@ -1,23 +1,17 @@
; RUN: llvm-xray convert %S/Inputs/naive-log-simple.xray -f=yaml -o - | FileCheck %s
; CHECK: ---
; CHECK-NEXT: header:
; CHECK-NEXT: version: 1
; CHECK-NEXT: type: 0
; CHECK-NEXT: constant-tsc: true
; CHECK-NEXT: nonstop-tsc: true
; CHECK-NEXT: cycle-frequency: 2601000000
; CHECK-NEXT: records:
-; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841453914 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841454542 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841454670 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841454762 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841454802 }
-; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841494828 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841453914 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454542 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454670 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454762 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454802 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841494828 }
; CHECK-NEXT: ...
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-debug-syms.txt
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-debug-syms.txt (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-debug-syms.txt (revision 314411)
@@ -1,23 +1,17 @@
; RUN: llvm-xray convert -m %S/Inputs/elf64-sample-o2.bin -y %S/Inputs/naive-log-simple.xray -f=yaml -o - 2>&1 | FileCheck %s
; CHECK: ---
; CHECK-NEXT: header:
; CHECK-NEXT: version: 1
; CHECK-NEXT: type: 0
; CHECK-NEXT: constant-tsc: true
; CHECK-NEXT: nonstop-tsc: true
; CHECK-NEXT: cycle-frequency: 2601000000
; CHECK-NEXT: records:
-; CHECK-NEXT: - { type: 0, func-id: 3, function: main, cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841453914 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: {{.*foo.*}}, cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841454542 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: {{.*foo.*}}, cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841454670 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: {{.*bar.*}}, cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841454762 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: {{.*bar.*}}, cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841454802 }
-; CHECK-NEXT: - { type: 0, func-id: 3, function: main, cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841494828 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: main, cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841453914 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: {{.*foo.*}}, cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454542 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: {{.*foo.*}}, cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454670 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: {{.*bar.*}}, cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454762 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: {{.*bar.*}}, cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454802 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: main, cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841494828 }
; CHECK-NEXT: ...
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-standalone-instrmap.txt
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-standalone-instrmap.txt (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-standalone-instrmap.txt (revision 314411)
@@ -1,23 +1,17 @@
; RUN: llvm-xray convert -m %S/Inputs/elf64-objcopied-instrmap.bin -y %S/Inputs/naive-log-simple.xray -f=yaml -o - 2>&1 | FileCheck %s
; CHECK: ---
; CHECK-NEXT: header:
; CHECK-NEXT: version: 1
; CHECK-NEXT: type: 0
; CHECK-NEXT: constant-tsc: true
; CHECK-NEXT: nonstop-tsc: true
; CHECK-NEXT: cycle-frequency: 2601000000
; CHECK-NEXT: records:
-; CHECK-NEXT: - { type: 0, func-id: 3, function: '@(41caa0)', cpu: 37, thread: 84697,
-; CHECK-NEXT: kind: function-enter, tsc: 3315356841453914 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: '@(41ca70)', cpu: 37, thread: 84697,
-; CHECK-NEXT: kind: function-enter, tsc: 3315356841454542 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: '@(41ca70)', cpu: 37, thread: 84697,
-; CHECK-NEXT: kind: function-exit, tsc: 3315356841454670 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: '@(41ca40)', cpu: 37, thread: 84697,
-; CHECK-NEXT: kind: function-enter, tsc: 3315356841454762 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: '@(41ca40)', cpu: 37, thread: 84697,
-; CHECK-NEXT: kind: function-exit, tsc: 3315356841454802 }
-; CHECK-NEXT: - { type: 0, func-id: 3, function: '@(41caa0)', cpu: 37, thread: 84697,
-; CHECK-NEXT: kind: function-exit, tsc: 3315356841494828 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: '@(41caa0)', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841453914 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: '@(41ca70)', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454542 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: '@(41ca70)', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454670 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: '@(41ca40)', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454762 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: '@(41ca40)', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454802 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: '@(41caa0)', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841494828 }
; CHECK-NEXT: ...
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-yaml-instrmap.txt
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-yaml-instrmap.txt (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/convert-with-yaml-instrmap.txt (revision 314411)
@@ -1,23 +1,17 @@
; RUN: llvm-xray convert -m %S/Inputs/simple-xray-instrmap.yaml -t yaml %S/Inputs/naive-log-simple.xray -f=yaml -o - | FileCheck %s
; CHECK: ---
; CHECK-NEXT: header:
; CHECK-NEXT: version: 1
; CHECK-NEXT: type: 0
; CHECK-NEXT: constant-tsc: true
; CHECK-NEXT: nonstop-tsc: true
; CHECK-NEXT: cycle-frequency: 2601000000
; CHECK-NEXT: records:
-; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841453914 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841454542 }
-; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841454670 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-enter,
-; CHECK-NEXT: tsc: 3315356841454762 }
-; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841454802 }
-; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-exit,
-; CHECK-NEXT: tsc: 3315356841494828 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841453914 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454542 }
+; CHECK-NEXT: - { type: 0, func-id: 2, function: '2', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454670 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-enter, tsc: 3315356841454762 }
+; CHECK-NEXT: - { type: 0, func-id: 1, function: '1', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841454802 }
+; CHECK-NEXT: - { type: 0, func-id: 3, function: '3', cpu: 37, thread: 84697, kind: function-exit, tsc: 3315356841494828 }
; CHECK-NEXT: ...
Index: vendor/llvm/dist/test/tools/llvm-xray/X86/extract-instrmap.ll
===================================================================
--- vendor/llvm/dist/test/tools/llvm-xray/X86/extract-instrmap.ll (revision 314410)
+++ vendor/llvm/dist/test/tools/llvm-xray/X86/extract-instrmap.ll (revision 314411)
@@ -1,15 +1,11 @@
; This test makes sure we can extract the instrumentation map from an
; XRay-instrumented object file.
;
; RUN: llvm-xray extract %S/Inputs/elf64-example.bin | FileCheck %s
; CHECK: ---
-; CHECK-NEXT: - { id: 1, address: 0x000000000041C900, function: 0x000000000041C900, kind: function-enter,
-; CHECK-NEXT: always-instrument: true }
-; CHECK-NEXT: - { id: 1, address: 0x000000000041C912, function: 0x000000000041C900, kind: function-exit,
-; CHECK-NEXT: always-instrument: true }
-; CHECK-NEXT: - { id: 2, address: 0x000000000041C930, function: 0x000000000041C930, kind: function-enter,
-; CHECK-NEXT: always-instrument: true }
-; CHECK-NEXT: - { id: 2, address: 0x000000000041C946, function: 0x000000000041C930, kind: function-exit,
-; CHECK-NEXT: always-instrument: true }
+; CHECK-NEXT: - { id: 1, address: 0x000000000041C900, function: 0x000000000041C900, kind: function-enter, always-instrument: true }
+; CHECK-NEXT: - { id: 1, address: 0x000000000041C912, function: 0x000000000041C900, kind: function-exit, always-instrument: true }
+; CHECK-NEXT: - { id: 2, address: 0x000000000041C930, function: 0x000000000041C930, kind: function-enter, always-instrument: true }
+; CHECK-NEXT: - { id: 2, address: 0x000000000041C946, function: 0x000000000041C930, kind: function-exit, always-instrument: true }
; CHECK-NEXT: ...
Index: vendor/llvm/dist/tools/llvm-xray/xray-converter.cc
===================================================================
--- vendor/llvm/dist/tools/llvm-xray/xray-converter.cc (revision 314410)
+++ vendor/llvm/dist/tools/llvm-xray/xray-converter.cc (revision 314411)
@@ -1,202 +1,202 @@
//===- xray-converter.cc - XRay Trace Conversion --------------------------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// Implements the trace conversion functions.
//
//===----------------------------------------------------------------------===//
#include "xray-converter.h"
#include "xray-extract.h"
#include "xray-registry.h"
#include "llvm/DebugInfo/Symbolize/Symbolize.h"
#include "llvm/Support/EndianStream.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/YAMLTraits.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/XRay/Trace.h"
#include "llvm/XRay/YAMLXRayRecord.h"
using namespace llvm;
using namespace xray;
// llvm-xray convert
// ----------------------------------------------------------------------------
static cl::SubCommand Convert("convert", "Trace Format Conversion");
static cl::opt<std::string> ConvertInput(cl::Positional,
cl::desc(""),
cl::Required, cl::sub(Convert));
enum class ConvertFormats { BINARY, YAML };
static cl::opt<ConvertFormats> ConvertOutputFormat(
"output-format", cl::desc("output format"),
cl::values(clEnumValN(ConvertFormats::BINARY, "raw", "output in binary"),
clEnumValN(ConvertFormats::YAML, "yaml", "output in yaml")),
cl::sub(Convert));
static cl::alias ConvertOutputFormat2("f", cl::aliasopt(ConvertOutputFormat),
cl::desc("Alias for -output-format"),
cl::sub(Convert));
static cl::opt<std::string>
ConvertOutput("output", cl::value_desc("output file"), cl::init("-"),
cl::desc("output file; use '-' for stdout"),
cl::sub(Convert));
static cl::alias ConvertOutput2("o", cl::aliasopt(ConvertOutput),
cl::desc("Alias for -output"),
cl::sub(Convert));
static cl::opt<bool>
ConvertSymbolize("symbolize",
cl::desc("symbolize function ids from the input log"),
cl::init(false), cl::sub(Convert));
static cl::alias ConvertSymbolize2("y", cl::aliasopt(ConvertSymbolize),
cl::desc("Alias for -symbolize"),
cl::sub(Convert));
static cl::opt<std::string>
ConvertInstrMap("instr_map",
cl::desc("binary with the instrumentation map, or "
"a separate instrumentation map"),
cl::value_desc("binary with xray_instr_map"),
cl::sub(Convert), cl::init(""));
static cl::alias ConvertInstrMap2("m", cl::aliasopt(ConvertInstrMap),
cl::desc("Alias for -instr_map"),
cl::sub(Convert));
static cl::opt<bool> ConvertSortInput(
"sort",
cl::desc("determines whether to sort input log records by timestamp"),
cl::sub(Convert), cl::init(true));
static cl::alias ConvertSortInput2("s", cl::aliasopt(ConvertSortInput),
cl::desc("Alias for -sort"),
cl::sub(Convert));
static cl::opt<InstrumentationMapExtractor::InputFormats> InstrMapFormat(
"instr-map-format", cl::desc("format of instrumentation map"),
cl::values(clEnumValN(InstrumentationMapExtractor::InputFormats::ELF, "elf",
"instrumentation map in an ELF header"),
clEnumValN(InstrumentationMapExtractor::InputFormats::YAML,
"yaml", "instrumentation map in YAML")),
cl::sub(Convert), cl::init(InstrumentationMapExtractor::InputFormats::ELF));
static cl::alias InstrMapFormat2("t", cl::aliasopt(InstrMapFormat),
cl::desc("Alias for -instr-map-format"),
cl::sub(Convert));
using llvm::yaml::IO;
using llvm::yaml::Output;
void TraceConverter::exportAsYAML(const Trace &Records, raw_ostream &OS) {
YAMLXRayTrace Trace;
const auto &FH = Records.getFileHeader();
Trace.Header = {FH.Version, FH.Type, FH.ConstantTSC, FH.NonstopTSC,
FH.CycleFrequency};
Trace.Records.reserve(Records.size());
for (const auto &R : Records) {
Trace.Records.push_back({R.RecordType, R.CPU, R.Type, R.FuncId,
Symbolize ? FuncIdHelper.SymbolOrNumber(R.FuncId)
: std::to_string(R.FuncId),
R.TSC, R.TId});
}
- Output Out(OS);
+ Output Out(OS, nullptr, 0);
Out << Trace;
}
void TraceConverter::exportAsRAWv1(const Trace &Records, raw_ostream &OS) {
// First write out the file header, in the correct endian-appropriate format
// (XRay assumes currently little endian).
support::endian::Writer<support::little> Writer(OS);
const auto &FH = Records.getFileHeader();
Writer.write(FH.Version);
Writer.write(FH.Type);
uint32_t Bitfield{0};
if (FH.ConstantTSC)
Bitfield |= 1uL;
if (FH.NonstopTSC)
Bitfield |= 1uL << 1;
Writer.write(Bitfield);
Writer.write(FH.CycleFrequency);
// There's 16 bytes of padding at the end of the file header.
static constexpr uint32_t Padding4B = 0;
Writer.write(Padding4B);
Writer.write(Padding4B);
Writer.write(Padding4B);
Writer.write(Padding4B);
// Then write out the rest of the records, still in an endian-appropriate
// format.
for (const auto &R : Records) {
Writer.write(R.RecordType);
Writer.write(R.CPU);
switch (R.Type) {
case RecordTypes::ENTER:
Writer.write(uint8_t{0});
break;
case RecordTypes::EXIT:
Writer.write(uint8_t{1});
break;
}
Writer.write(R.FuncId);
Writer.write(R.TSC);
Writer.write(R.TId);
Writer.write(Padding4B);
Writer.write(Padding4B);
Writer.write(Padding4B);
}
}
namespace llvm {
namespace xray {
static CommandRegistration Unused(&Convert, []() -> Error {
// FIXME: Support conversion to BINARY when upgrading XRay trace versions.
int Fd;
auto EC = sys::fs::openFileForRead(ConvertInput, Fd);
if (EC)
return make_error<StringError>(
Twine("Cannot open file '") + ConvertInput + "'", EC);
Error Err = Error::success();
xray::InstrumentationMapExtractor Extractor(ConvertInstrMap, InstrMapFormat,
Err);
handleAllErrors(std::move(Err),
[&](const ErrorInfoBase &E) { E.log(errs()); });
const auto &FunctionAddresses = Extractor.getFunctionAddresses();
symbolize::LLVMSymbolizer::Options Opts(
symbolize::FunctionNameKind::LinkageName, true, true, false, "");
symbolize::LLVMSymbolizer Symbolizer(Opts);
llvm::xray::FuncIdConversionHelper FuncIdHelper(ConvertInstrMap, Symbolizer,
FunctionAddresses);
llvm::xray::TraceConverter TC(FuncIdHelper, ConvertSymbolize);
raw_fd_ostream OS(ConvertOutput, EC,
ConvertOutputFormat == ConvertFormats::BINARY
? sys::fs::OpenFlags::F_None
: sys::fs::OpenFlags::F_Text);
if (EC)
return make_error<StringError>(
Twine("Cannot open file '") + ConvertOutput + "' for writing.", EC);
if (auto TraceOrErr = loadTraceFile(ConvertInput, ConvertSortInput)) {
auto &T = *TraceOrErr;
switch (ConvertOutputFormat) {
case ConvertFormats::YAML:
TC.exportAsYAML(T, OS);
break;
case ConvertFormats::BINARY:
TC.exportAsRAWv1(T, OS);
break;
}
} else {
return joinErrors(
make_error<StringError>(
Twine("Failed loading input file '") + ConvertInput + "'.",
std::make_error_code(std::errc::executable_format_error)),
TraceOrErr.takeError());
}
return Error::success();
});
} // namespace xray
} // namespace llvm
Index: vendor/llvm/dist/tools/llvm-xray/xray-extract.cc
===================================================================
--- vendor/llvm/dist/tools/llvm-xray/xray-extract.cc (revision 314410)
+++ vendor/llvm/dist/tools/llvm-xray/xray-extract.cc (revision 314411)
@@ -1,291 +1,291 @@
//===- xray-extract.cc - XRay Instrumentation Map Extraction --------------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// Implementation of the xray-extract.h interface.
//
// FIXME: Support other XRay-instrumented binary formats other than ELF.
//
//===----------------------------------------------------------------------===//
#include
#include
#include "xray-extract.h"
#include "xray-registry.h"
#include "xray-sleds.h"
#include "llvm/Object/ELF.h"
#include "llvm/Object/ObjectFile.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/DataExtractor.h"
#include "llvm/Support/ELF.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/FileSystem.h"
#include "llvm/Support/Format.h"
#include "llvm/Support/YAMLTraits.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;
using namespace llvm::xray;
using namespace llvm::yaml;
// llvm-xray extract
// ----------------------------------------------------------------------------
static cl::SubCommand Extract("extract", "Extract instrumentation maps");
static cl::opt<std::string> ExtractInput(cl::Positional,
cl::desc(""), cl::Required,
cl::sub(Extract));
static cl::opt<std::string>
ExtractOutput("output", cl::value_desc("output file"), cl::init("-"),
cl::desc("output file; use '-' for stdout"),
cl::sub(Extract));
static cl::alias ExtractOutput2("o", cl::aliasopt(ExtractOutput),
cl::desc("Alias for -output"),
cl::sub(Extract));
struct YAMLXRaySledEntry {
int32_t FuncId;
Hex64 Address;
Hex64 Function;
SledEntry::FunctionKinds Kind;
bool AlwaysInstrument;
};
namespace llvm {
namespace yaml {
template <> struct ScalarEnumerationTraits<SledEntry::FunctionKinds> {
static void enumeration(IO &IO, SledEntry::FunctionKinds &Kind) {
IO.enumCase(Kind, "function-enter", SledEntry::FunctionKinds::ENTRY);
IO.enumCase(Kind, "function-exit", SledEntry::FunctionKinds::EXIT);
IO.enumCase(Kind, "tail-exit", SledEntry::FunctionKinds::TAIL);
}
};
template <> struct MappingTraits<YAMLXRaySledEntry> {
static void mapping(IO &IO, YAMLXRaySledEntry &Entry) {
IO.mapRequired("id", Entry.FuncId);
IO.mapRequired("address", Entry.Address);
IO.mapRequired("function", Entry.Function);
IO.mapRequired("kind", Entry.Kind);
IO.mapRequired("always-instrument", Entry.AlwaysInstrument);
}
static constexpr bool flow = true;
};
}
}
LLVM_YAML_IS_SEQUENCE_VECTOR(YAMLXRaySledEntry)
namespace {
llvm::Error LoadBinaryInstrELF(
StringRef Filename, std::deque<SledEntry> &OutputSleds,
InstrumentationMapExtractor::FunctionAddressMap &InstrMap,
InstrumentationMapExtractor::FunctionAddressReverseMap &FunctionIds) {
auto ObjectFile = object::ObjectFile::createObjectFile(Filename);
if (!ObjectFile)
return ObjectFile.takeError();
// FIXME: Maybe support other ELF formats. For now, 64-bit Little Endian only.
if (!ObjectFile->getBinary()->isELF())
return make_error<StringError>(
"File format not supported (only does ELF).",
std::make_error_code(std::errc::not_supported));
if (ObjectFile->getBinary()->getArch() != Triple::x86_64)
return make_error<StringError>(
"File format not supported (only does ELF little endian 64-bit).",
std::make_error_code(std::errc::not_supported));
// Find the section named "xray_instr_map".
StringRef Contents = "";
const auto &Sections = ObjectFile->getBinary()->sections();
auto I = find_if(Sections, [&](object::SectionRef Section) {
StringRef Name = "";
if (Section.getName(Name))
return false;
return Name == "xray_instr_map";
});
if (I == Sections.end())
return make_error<StringError>(
"Failed to find XRay instrumentation map.",
std::make_error_code(std::errc::not_supported));
if (I->getContents(Contents))
return make_error<StringError>(
"Failed to get contents of 'xray_instr_map' section.",
std::make_error_code(std::errc::executable_format_error));
// Copy the instrumentation map data into the Sleds data structure.
auto C = Contents.bytes_begin();
static constexpr size_t ELF64SledEntrySize = 32;
if ((C - Contents.bytes_end()) % ELF64SledEntrySize != 0)
return make_error<StringError>(
"Instrumentation map entries not evenly divisible by size of an XRay "
"sled entry in ELF64.",
std::make_error_code(std::errc::executable_format_error));
int32_t FuncId = 1;
uint64_t CurFn = 0;
std::deque<SledEntry> Sleds;
for (; C != Contents.bytes_end(); C += ELF64SledEntrySize) {
DataExtractor Extractor(
StringRef(reinterpret_cast<const char *>(C), ELF64SledEntrySize), true,
8);
Sleds.push_back({});
auto &Entry = Sleds.back();
uint32_t OffsetPtr = 0;
Entry.Address = Extractor.getU64(&OffsetPtr);
Entry.Function = Extractor.getU64(&OffsetPtr);
auto Kind = Extractor.getU8(&OffsetPtr);
switch (Kind) {
case 0: // ENTRY
Entry.Kind = SledEntry::FunctionKinds::ENTRY;
break;
case 1: // EXIT
Entry.Kind = SledEntry::FunctionKinds::EXIT;
break;
case 2: // TAIL
Entry.Kind = SledEntry::FunctionKinds::TAIL;
break;
default:
return make_error<StringError>(
Twine("Encountered unknown sled type ") + "'" + Twine(int32_t{Kind}) +
"'.",
std::make_error_code(std::errc::executable_format_error));
}
Entry.AlwaysInstrument = Extractor.getU8(&OffsetPtr) != 0;
// We replicate the function id generation scheme implemented in the runtime
// here. Ideally we should be able to break it out, or output this map from
// the runtime, but that's a design point we can discuss later on. For now,
// we replicate the logic and move on.
if (CurFn == 0) {
CurFn = Entry.Function;
InstrMap[FuncId] = Entry.Function;
FunctionIds[Entry.Function] = FuncId;
}
if (Entry.Function != CurFn) {
++FuncId;
CurFn = Entry.Function;
InstrMap[FuncId] = Entry.Function;
FunctionIds[Entry.Function] = FuncId;
}
}
OutputSleds = std::move(Sleds);
return llvm::Error::success();
}
Error LoadYAMLInstrMap(
StringRef Filename, std::deque<SledEntry> &Sleds,
InstrumentationMapExtractor::FunctionAddressMap &InstrMap,
InstrumentationMapExtractor::FunctionAddressReverseMap &FunctionIds) {
int Fd;
if (auto EC = sys::fs::openFileForRead(Filename, Fd))
return make_error<StringError>(
Twine("Failed opening file '") + Filename + "' for reading.", EC);
uint64_t FileSize;
if (auto EC = sys::fs::file_size(Filename, FileSize))
return make_error<StringError>(
Twine("Failed getting size of file '") + Filename + "'.", EC);
std::error_code EC;
sys::fs::mapped_file_region MappedFile(
Fd, sys::fs::mapped_file_region::mapmode::readonly, FileSize, 0, EC);
if (EC)
return make_error<StringError>(
Twine("Failed memory-mapping file '") + Filename + "'.", EC);
std::vector<YAMLXRaySledEntry> YAMLSleds;
Input In(StringRef(MappedFile.data(), MappedFile.size()));
In >> YAMLSleds;
if (In.error())
return make_error<StringError>(
Twine("Failed loading YAML document from '") + Filename + "'.",
In.error());
for (const auto &Y : YAMLSleds) {
InstrMap[Y.FuncId] = Y.Function;
FunctionIds[Y.Function] = Y.FuncId;
Sleds.push_back(
SledEntry{Y.Address, Y.Function, Y.Kind, Y.AlwaysInstrument});
}
return Error::success();
}
} // namespace
InstrumentationMapExtractor::InstrumentationMapExtractor(std::string Filename,
InputFormats Format,
Error &EC) {
ErrorAsOutParameter ErrAsOutputParam(&EC);
if (Filename.empty()) {
EC = Error::success();
return;
}
switch (Format) {
case InputFormats::ELF: {
EC = handleErrors(
LoadBinaryInstrELF(Filename, Sleds, FunctionAddresses, FunctionIds),
[&](std::unique_ptr<ErrorInfoBase> E) {
return joinErrors(
make_error<StringError>(
Twine("Cannot extract instrumentation map from '") +
Filename + "'.",
std::make_error_code(std::errc::executable_format_error)),
std::move(E));
});
break;
}
case InputFormats::YAML: {
EC = handleErrors(
LoadYAMLInstrMap(Filename, Sleds, FunctionAddresses, FunctionIds),
[&](std::unique_ptr<ErrorInfoBase> E) {
return joinErrors(
make_error<StringError>(
Twine("Cannot load YAML instrumentation map from '") +
Filename + "'.",
std::make_error_code(std::errc::executable_format_error)),
std::move(E));
});
break;
}
}
}
void InstrumentationMapExtractor::exportAsYAML(raw_ostream &OS) {
// First we translate the sleds into the YAMLXRaySledEntry objects in a deque.
std::vector<YAMLXRaySledEntry> YAMLSleds;
YAMLSleds.reserve(Sleds.size());
for (const auto &Sled : Sleds) {
YAMLSleds.push_back({FunctionIds[Sled.Function], Sled.Address,
Sled.Function, Sled.Kind, Sled.AlwaysInstrument});
}
- Output Out(OS);
+ Output Out(OS, nullptr, 0);
Out << YAMLSleds;
}
static CommandRegistration Unused(&Extract, []() -> Error {
Error Err = Error::success();
xray::InstrumentationMapExtractor Extractor(
ExtractInput, InstrumentationMapExtractor::InputFormats::ELF, Err);
if (Err)
return Err;
std::error_code EC;
raw_fd_ostream OS(ExtractOutput, EC, sys::fs::OpenFlags::F_Text);
if (EC)
return make_error<StringError>(
Twine("Cannot open file '") + ExtractOutput + "' for writing.", EC);
Extractor.exportAsYAML(OS);
return Error::success();
});