Vendor import of llvm release_40 branch r294803:

https://llvm.org/svn/llvm-project/llvm/branches/release_40@294803
dim 2017-02-11 13:25:24 +00:00
parent fd782386ad
commit 43b21f1541
31 changed files with 662 additions and 416 deletions

View File

@ -55,6 +55,12 @@ Non-comprehensive list of changes in this release
* LLVM now handles invariant.group across different basic blocks, which makes
it possible to devirtualize virtual calls inside loops.
* The aggressive dead code elimination phase ("adce") now removes
branches which do not affect program behavior. Loops are retained by
default since they may be infinite, but these can also be removed
with the LLVM option ``-adce-remove-loops`` when the loop body otherwise has
no live operations.
* ... next change ...
.. NOTE
@ -75,6 +81,95 @@ Non-comprehensive list of changes in this release
* Significant build-time and binary-size improvements when compiling with
debug info (-g).
Code Generation Testing
-----------------------
Passes that work on the machine instruction representation can be tested with
the .mir serialization format. ``llc`` supports the ``-run-pass``,
``-stop-after``, ``-stop-before``, ``-start-after``, and ``-start-before``
options to run a single pass of the code generation pipeline, or to stop or
start the code generation pipeline at a given point.
Additional information can be found in the :doc:`MIRLangRef`. The format is
used by the tests ending in ``.mir`` in the ``test/CodeGen`` directory.
This feature has been available since 2015. It has seen increasing use lately
and had not previously been mentioned in the release notes.
Intrusive list API overhaul
---------------------------
The intrusive list infrastructure was substantially rewritten over the last
couple of releases, primarily to excise undefined behaviour. The biggest
changes landed in this release; a short usage sketch follows the list below.
* ``simple_ilist<T>`` is a lower-level intrusive list that never takes
ownership of its nodes. New intrusive-list clients should consider using it
instead of ``ilist<T>``.
* ``ilist_tag<class>`` allows a single data type to be inserted into two
parallel intrusive lists. A type can inherit twice from ``ilist_node``,
first using ``ilist_node<T,ilist_tag<A>>`` (enabling insertion into
``simple_ilist<T,ilist_tag<A>>``) and second using
``ilist_node<T,ilist_tag<B>>`` (enabling insertion into
``simple_ilist<T,ilist_tag<B>>``), where ``A`` and ``B`` are arbitrary
types.
* ``ilist_sentinel_tracking<bool>`` controls whether an iterator knows
whether it's pointing at the sentinel (``end()``). By default, sentinel
tracking is on when ABI-breaking checks are enabled, and off otherwise;
this is used for an assertion when dereferencing ``end()`` (this assertion
triggered often in practice, and many backend bugs were fixed). Explicitly
turning on sentinel tracking also enables ``iterator::isEnd()``. This is
used by ``MachineInstrBundleIterator`` to iterate over bundles.
* ``ilist<T>`` is built on top of ``simple_ilist<T>``, and supports the same
configuration options. As before (and unlike ``simple_ilist<T>``),
``ilist<T>`` takes ownership of its nodes. However, it no longer supports
*allocating* nodes, and is now equivalent to ``iplist<T>``. ``iplist<T>``
will likely be removed in the future.
* ``ilist<T>`` now always uses ``ilist_traits<T>``. Instead of passing a
custom traits class in via a template parameter, clients that want to
customize the traits should specialize ``ilist_traits<T>``. Clients that
want to avoid ownership can specialize ``ilist_alloc_traits<T>`` to inherit
from ``ilist_noalloc_traits<T>`` (or to do something funky); clients that
need callbacks can specialize ``ilist_callback_traits<T>`` directly.
* The underlying data structure is now a simple recursive linked list. The
sentinel node contains only a "next" (``begin()``) and "prev" (``rbegin()``)
pointer and is stored in the same allocation as ``simple_ilist<T>``.
Previously, it was malloc-allocated on-demand by default, although the
now-defunct ``ilist_sentinel_traits<T>`` was sometimes specialized to avoid
this.
* The ``reverse_iterator`` class no longer uses ``std::reverse_iterator``.
Instead, it now has a handle to the same node that it dereferences to.
Reverse iterators now have the same iterator invalidation semantics as
forward iterators.
* ``iterator`` and ``reverse_iterator`` have explicit conversion constructors
that match ``std::reverse_iterator``'s off-by-one semantics, so that
reversing the end points of an iterator range results in the same range
(albeit in reverse). I.e., ``reverse_iterator(begin())`` equals
``rend()``.
* ``iterator::getReverse()`` and ``reverse_iterator::getReverse()`` return an
iterator that dereferences to the *same* node. I.e.,
``begin().getReverse()`` equals ``--rend()``.
* ``ilist_node<T>::getIterator()`` and
``ilist_node<T>::getReverseIterator()`` return the forward and reverse
iterators that dereference to the current node. I.e.,
``begin()->getIterator()`` equals ``begin()`` and
``rbegin()->getReverseIterator()`` equals ``rbegin()``.
* ``iterator`` now stores an ``ilist_node_base*`` instead of a ``T*``. The
implicit conversions between ``ilist<T>::iterator`` and ``T*`` have been
removed. Clients may use ``N->getIterator()`` (if not ``nullptr``) or
``&*I`` (if not ``end()``); alternatively, clients may refactor to use
references for known-good nodes.
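To illustrate the new API, here is a minimal sketch (not taken from the LLVM
sources) that puts one node type on two parallel lists and exercises the new
iterator conversions. The ``ListA``/``ListB`` tag types, ``MyNode``, and
``demo()`` are made-up names, and the example assumes the
``llvm/ADT/ilist_node.h`` and ``llvm/ADT/simple_ilist.h`` headers from this
release:

.. code-block:: c++

  #include "llvm/ADT/ilist_node.h"
  #include "llvm/ADT/simple_ilist.h"
  #include <cassert>

  // Arbitrary tag types; the names are purely illustrative.
  struct ListA {};
  struct ListB {};

  // A node type that can be inserted into two parallel intrusive lists.
  struct MyNode : llvm::ilist_node<MyNode, llvm::ilist_tag<ListA>>,
                  llvm::ilist_node<MyNode, llvm::ilist_tag<ListB>> {
    int Value = 0;
  };

  void demo() {
    MyNode N0, N1;

    // simple_ilist never takes ownership of its nodes.
    llvm::simple_ilist<MyNode, llvm::ilist_tag<ListA>> A;
    llvm::simple_ilist<MyNode, llvm::ilist_tag<ListB>> B;
    A.push_back(N0);
    A.push_back(N1);
    B.push_back(N1); // N1 is on both lists at the same time.

    typedef llvm::simple_ilist<MyNode,
                               llvm::ilist_tag<ListA>>::reverse_iterator
        reverse_iterator;

    // The explicit conversions preserve range boundaries
    // (std::reverse_iterator off-by-one semantics).
    assert(reverse_iterator(A.begin()) == A.rend());
    assert(reverse_iterator(A.end()) == A.rbegin());

    // getReverse() returns an iterator dereferencing to the *same* node.
    assert(&*A.begin().getReverse() == &*A.begin());
  }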
Changes to the LLVM IR
----------------------
@ -133,9 +228,23 @@ Changes to the AMDGPU Target
Changes to the AVR Target
-----------------------------
* The entire backend has been merged in-tree with all tests passing. All of
the instruction selection code and the machine code backend have landed
recently and are fully usable.
This marks the first release where the AVR backend has been completely merged
from a fork into LLVM trunk. The backend is still marked experimental, but
is generally quite usable. All downstream development has halted on
`GitHub <https://github.com/avr-llvm/llvm>`_, and changes now go directly into
LLVM trunk.
* Instruction selector and pseudo instruction expansion pass landed
* `read_register` and `write_register` intrinsics are now supported
* Support stack stores greater than 63 bytes from the bottom of the stack
* A number of assertion errors have been fixed
* Support stores to `undef` locations
* Very basic support for the target has been added to clang
* Small optimizations to some 16-bit boolean expressions
Most of the work behind the scenes has been on the correctness of generated
assembly, and on fixing some assertions we would hit on well-formed
inputs.
Changes to the OCaml bindings
-----------------------------

View File

@ -48,9 +48,9 @@ copyright = u'2003-%d, LLVM Project' % date.today().year
# built documents.
#
# The short X.Y version.
version = '4.0'
version = '4'
# The full version, including alpha/beta/rc tags.
release = '4.0'
release = '4'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.

View File

@ -102,10 +102,23 @@ public:
return *this;
}
/// Convert from an iterator to its reverse.
/// Explicit conversion between forward/reverse iterators.
///
/// TODO: Roll this into the implicit constructor once we're sure that no one
/// is relying on the std::reverse_iterator off-by-one semantics.
/// Translate between forward and reverse iterators without changing range
/// boundaries. The resulting iterator will dereference (and have a handle)
/// to the previous node, which is somewhat unexpected; but converting the
/// two endpoints in a range will give the same range in reverse.
///
/// This matches std::reverse_iterator conversions.
explicit ilist_iterator(
const ilist_iterator<OptionsT, !IsReverse, IsConst> &RHS)
: ilist_iterator(++RHS.getReverse()) {}
/// Get a reverse iterator to the same node.
///
/// Gives a reverse iterator that will dereference (and have a handle) to the
/// same node. Converting the endpoint iterators in a range will give a
/// different range; for range operations, use the explicit conversions.
ilist_iterator<OptionsT, !IsReverse, IsConst> getReverse() const {
if (NodePtr)
return ilist_iterator<OptionsT, !IsReverse, IsConst>(*NodePtr);

View File

@ -153,6 +153,18 @@ public:
: MII(I.getInstrIterator()) {}
MachineInstrBundleIterator() : MII(nullptr) {}
/// Explicit conversion between forward/reverse iterators.
///
/// Translate between forward and reverse iterators without changing range
/// boundaries. The resulting iterator will dereference (and have a handle)
/// to the previous node, which is somewhat unexpected; but converting the
/// two endpoints in a range will give the same range in reverse.
///
/// This matches std::reverse_iterator conversions.
explicit MachineInstrBundleIterator(
const MachineInstrBundleIterator<Ty, !IsReverse> &I)
: MachineInstrBundleIterator(++I.getReverse()) {}
/// Get the bundle iterator for the given instruction's bundle.
static MachineInstrBundleIterator getAtBundleBegin(instr_iterator MI) {
return MachineInstrBundleIteratorHelper<IsReverse>::getBundleBegin(MI);
@ -258,6 +270,11 @@ public:
nonconst_iterator getNonConstIterator() const { return MII.getNonConst(); }
/// Get a reverse iterator to the same node.
///
/// Gives a reverse iterator that will dereference (and have a handle) to the
/// same node. Converting the endpoint iterators in a range will give a
/// different range; for range operations, use the explicit conversions.
reverse_iterator getReverse() const { return MII.getReverse(); }
};

View File

@ -311,6 +311,8 @@ template <typename IRUnitT, typename... ExtraArgTs> class AnalysisManager;
template <typename DerivedT> struct PassInfoMixin {
/// Gets the name of the pass we are mixed into.
static StringRef name() {
static_assert(std::is_base_of<PassInfoMixin, DerivedT>::value,
"Must pass the derived type as the template argument!");
StringRef Name = getTypeName<DerivedT>();
if (Name.startswith("llvm::"))
Name = Name.drop_front(strlen("llvm::"));
@ -339,7 +341,11 @@ struct AnalysisInfoMixin : PassInfoMixin<DerivedT> {
/// known platform with this limitation is Windows DLL builds, specifically
/// building each part of LLVM as a DLL. If we ever remove that build
/// configuration, this mixin can provide the static key as well.
static AnalysisKey *ID() { return &DerivedT::Key; }
static AnalysisKey *ID() {
static_assert(std::is_base_of<AnalysisInfoMixin, DerivedT>::value,
"Must pass the derived type as the template argument!");
return &DerivedT::Key;
}
};
/// This templated class represents "all analyses that operate over \<a
@ -1010,7 +1016,7 @@ extern template class InnerAnalysisManagerProxy<FunctionAnalysisManager,
template <typename AnalysisManagerT, typename IRUnitT, typename... ExtraArgTs>
class OuterAnalysisManagerProxy
: public AnalysisInfoMixin<
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT>> {
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT, ExtraArgTs...>> {
public:
/// \brief Result proxy object for \c OuterAnalysisManagerProxy.
class Result {
@ -1072,7 +1078,7 @@ public:
private:
friend AnalysisInfoMixin<
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT>>;
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT, ExtraArgTs...>>;
static AnalysisKey Key;
const AnalysisManagerT *AM;

View File

@ -1108,25 +1108,6 @@ public:
/// terminator instruction that has not been predicated.
virtual bool isUnpredicatedTerminator(const MachineInstr &MI) const;
/// Returns true if MI is an unconditional tail call.
virtual bool isUnconditionalTailCall(const MachineInstr &MI) const {
return false;
}
/// Returns true if the tail call can be made conditional on BranchCond.
virtual bool
canMakeTailCallConditional(SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const {
return false;
}
/// Replace the conditional branch in MBB with a conditional tail call.
virtual void replaceBranchWithTailCall(MachineBasicBlock &MBB,
SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const {
llvm_unreachable("Target didn't implement replaceBranchWithTailCall!");
}
/// Convert the instruction into a predicated instruction.
/// It returns true if the operation was successful.
virtual bool PredicateInstruction(MachineInstr &MI,

View File

@ -448,6 +448,7 @@ class MetadataLoader::MetadataLoaderImpl {
bool StripTBAA = false;
bool HasSeenOldLoopTags = false;
bool NeedUpgradeToDIGlobalVariableExpression = false;
/// True if metadata is being parsed for a module being ThinLTO imported.
bool IsImporting = false;
@ -473,6 +474,45 @@ class MetadataLoader::MetadataLoaderImpl {
CUSubprograms.clear();
}
/// Upgrade old-style bare DIGlobalVariables to DIGlobalVariableExpressions.
void upgradeCUVariables() {
if (!NeedUpgradeToDIGlobalVariableExpression)
return;
// Upgrade list of variables attached to the CUs.
if (NamedMDNode *CUNodes = TheModule.getNamedMetadata("llvm.dbg.cu"))
for (unsigned I = 0, E = CUNodes->getNumOperands(); I != E; ++I) {
auto *CU = cast<DICompileUnit>(CUNodes->getOperand(I));
if (auto *GVs = dyn_cast_or_null<MDTuple>(CU->getRawGlobalVariables()))
for (unsigned I = 0; I < GVs->getNumOperands(); I++)
if (auto *GV =
dyn_cast_or_null<DIGlobalVariable>(GVs->getOperand(I))) {
auto *DGVE =
DIGlobalVariableExpression::getDistinct(Context, GV, nullptr);
GVs->replaceOperandWith(I, DGVE);
}
}
// Upgrade variables attached to globals.
for (auto &GV : TheModule.globals()) {
SmallVector<MDNode *, 1> MDs, NewMDs;
GV.getMetadata(LLVMContext::MD_dbg, MDs);
GV.eraseMetadata(LLVMContext::MD_dbg);
for (auto *MD : MDs)
if (auto *DGV = dyn_cast_or_null<DIGlobalVariable>(MD)) {
auto *DGVE =
DIGlobalVariableExpression::getDistinct(Context, DGV, nullptr);
GV.addMetadata(LLVMContext::MD_dbg, *DGVE);
} else
GV.addMetadata(LLVMContext::MD_dbg, *MD);
}
}
void upgradeDebugInfo() {
upgradeCUSubprograms();
upgradeCUVariables();
}
public:
MetadataLoaderImpl(BitstreamCursor &Stream, Module &TheModule,
BitcodeReaderValueList &ValueList,
@ -726,7 +766,7 @@ Error MetadataLoader::MetadataLoaderImpl::parseMetadata(bool ModuleLevel) {
// Reading the named metadata created forward references and/or
// placeholders, that we flush here.
resolveForwardRefsAndPlaceholders(Placeholders);
upgradeCUSubprograms();
upgradeDebugInfo();
// Return at the beginning of the block, since it is easy to skip it
// entirely from there.
Stream.ReadBlockEnd(); // Pop the abbrev block context.
@ -750,7 +790,7 @@ Error MetadataLoader::MetadataLoaderImpl::parseMetadata(bool ModuleLevel) {
return error("Malformed block");
case BitstreamEntry::EndBlock:
resolveForwardRefsAndPlaceholders(Placeholders);
upgradeCUSubprograms();
upgradeDebugInfo();
return Error::success();
case BitstreamEntry::Record:
// The interesting case.
@ -1420,11 +1460,17 @@ Error MetadataLoader::MetadataLoaderImpl::parseOneMetadata(
getDITypeRefOrNull(Record[6]), Record[7], Record[8],
getMDOrNull(Record[10]), AlignInBits));
auto *DGVE = DIGlobalVariableExpression::getDistinct(Context, DGV, Expr);
MetadataList.assignValue(DGVE, NextMetadataNo);
NextMetadataNo++;
DIGlobalVariableExpression *DGVE = nullptr;
if (Attach || Expr)
DGVE = DIGlobalVariableExpression::getDistinct(Context, DGV, Expr);
else
NeedUpgradeToDIGlobalVariableExpression = true;
if (Attach)
Attach->addDebugInfo(DGVE);
auto *MDNode = Expr ? cast<Metadata>(DGVE) : cast<Metadata>(DGV);
MetadataList.assignValue(MDNode, NextMetadataNo);
NextMetadataNo++;
} else
return error("Invalid record");

View File

@ -49,7 +49,6 @@ STATISTIC(NumDeadBlocks, "Number of dead blocks removed");
STATISTIC(NumBranchOpts, "Number of branches optimized");
STATISTIC(NumTailMerge , "Number of block tails merged");
STATISTIC(NumHoist , "Number of times common instructions are hoisted");
STATISTIC(NumTailCalls, "Number of tail calls optimized");
static cl::opt<cl::boolOrDefault> FlagEnableTailMerge("enable-tail-merge",
cl::init(cl::BOU_UNSET), cl::Hidden);
@ -1387,42 +1386,6 @@ ReoptimizeBlock:
}
}
if (!IsEmptyBlock(MBB) && MBB->pred_size() == 1 &&
MF.getFunction()->optForSize()) {
// Changing "Jcc foo; foo: jmp bar;" into "Jcc bar;" might change the branch
// direction, thereby defeating careful block placement and regressing
// performance. Therefore, only consider this for optsize functions.
MachineInstr &TailCall = *MBB->getFirstNonDebugInstr();
if (TII->isUnconditionalTailCall(TailCall)) {
MachineBasicBlock *Pred = *MBB->pred_begin();
MachineBasicBlock *PredTBB = nullptr, *PredFBB = nullptr;
SmallVector<MachineOperand, 4> PredCond;
bool PredAnalyzable =
!TII->analyzeBranch(*Pred, PredTBB, PredFBB, PredCond, true);
if (PredAnalyzable && !PredCond.empty() && PredTBB == MBB) {
// The predecessor has a conditional branch to this block which consists
// of only a tail call. Try to fold the tail call into the conditional
// branch.
if (TII->canMakeTailCallConditional(PredCond, TailCall)) {
// TODO: It would be nice if analyzeBranch() could provide a pointer
// to the branch insturction so replaceBranchWithTailCall() doesn't
// have to search for it.
TII->replaceBranchWithTailCall(*Pred, PredCond, TailCall);
++NumTailCalls;
Pred->removeSuccessor(MBB);
MadeChange = true;
return MadeChange;
}
}
// If the predecessor is falling through to this block, we could reverse
// the branch condition and fold the tail call into that. However, after
// that we might have to re-arrange the CFG to fall through to the other
// block and there is a high risk of regressing code size rather than
// improving it.
}
}
// Analyze the branch in the current block.
MachineBasicBlock *CurTBB = nullptr, *CurFBB = nullptr;
SmallVector<MachineOperand, 4> CurCond;

View File

@ -61,6 +61,7 @@ namespace {
private:
void ClobberRegister(unsigned Reg);
void ReadRegister(unsigned Reg);
void CopyPropagateBlock(MachineBasicBlock &MBB);
bool eraseIfRedundant(MachineInstr &Copy, unsigned Src, unsigned Def);
@ -120,6 +121,18 @@ void MachineCopyPropagation::ClobberRegister(unsigned Reg) {
}
}
void MachineCopyPropagation::ReadRegister(unsigned Reg) {
// If 'Reg' is defined by a copy, the copy is no longer a candidate
// for elimination.
for (MCRegAliasIterator AI(Reg, TRI, true); AI.isValid(); ++AI) {
Reg2MIMap::iterator CI = CopyMap.find(*AI);
if (CI != CopyMap.end()) {
DEBUG(dbgs() << "MCP: Copy is used - not dead: "; CI->second->dump());
MaybeDeadCopies.remove(CI->second);
}
}
}
/// Return true if \p PreviousCopy did copy register \p Src to register \p Def.
/// This fact may have been obscured by sub register usage or may not be true at
/// all even though Src and Def are subregisters of the registers used in
@ -212,12 +225,14 @@ void MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock &MBB) {
// If Src is defined by a previous copy, the previous copy cannot be
// eliminated.
for (MCRegAliasIterator AI(Src, TRI, true); AI.isValid(); ++AI) {
Reg2MIMap::iterator CI = CopyMap.find(*AI);
if (CI != CopyMap.end()) {
DEBUG(dbgs() << "MCP: Copy is no longer dead: "; CI->second->dump());
MaybeDeadCopies.remove(CI->second);
}
ReadRegister(Src);
for (const MachineOperand &MO : MI->implicit_operands()) {
if (!MO.isReg() || !MO.readsReg())
continue;
unsigned Reg = MO.getReg();
if (!Reg)
continue;
ReadRegister(Reg);
}
DEBUG(dbgs() << "MCP: Copy is a deletion candidate: "; MI->dump());
@ -234,6 +249,14 @@ void MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock &MBB) {
// ...
// %xmm2<def> = copy %xmm9
ClobberRegister(Def);
for (const MachineOperand &MO : MI->implicit_operands()) {
if (!MO.isReg() || !MO.isDef())
continue;
unsigned Reg = MO.getReg();
if (!Reg)
continue;
ClobberRegister(Reg);
}
// Remember Def is defined by the copy.
for (MCSubRegIterator SR(Def, TRI, /*IncludeSelf=*/true); SR.isValid();
@ -268,17 +291,8 @@ void MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock &MBB) {
if (MO.isDef()) {
Defs.push_back(Reg);
continue;
}
// If 'Reg' is defined by a copy, the copy is no longer a candidate
// for elimination.
for (MCRegAliasIterator AI(Reg, TRI, true); AI.isValid(); ++AI) {
Reg2MIMap::iterator CI = CopyMap.find(*AI);
if (CI != CopyMap.end()) {
DEBUG(dbgs() << "MCP: Copy is used - not dead: "; CI->second->dump());
MaybeDeadCopies.remove(CI->second);
}
} else {
ReadRegister(Reg);
}
// Treat undef use like defs for copy propagation but not for
// dead copy. We would need to do a liveness check to be sure the copy

View File

@ -1556,9 +1556,10 @@ bool RegisterCoalescer::joinCopy(MachineInstr *CopyMI, bool &Again) {
bool RegisterCoalescer::joinReservedPhysReg(CoalescerPair &CP) {
unsigned DstReg = CP.getDstReg();
unsigned SrcReg = CP.getSrcReg();
assert(CP.isPhys() && "Must be a physreg copy");
assert(MRI->isReserved(DstReg) && "Not a reserved register");
LiveInterval &RHS = LIS->getInterval(CP.getSrcReg());
LiveInterval &RHS = LIS->getInterval(SrcReg);
DEBUG(dbgs() << "\t\tRHS = " << RHS << '\n');
assert(RHS.containsOneValue() && "Invalid join with reserved register");
@ -1592,17 +1593,36 @@ bool RegisterCoalescer::joinReservedPhysReg(CoalescerPair &CP) {
// Delete the identity copy.
MachineInstr *CopyMI;
if (CP.isFlipped()) {
CopyMI = MRI->getVRegDef(RHS.reg);
// Physreg is copied into vreg
// %vregY = COPY %X
// ... //< no other def of %X here
// use %vregY
// =>
// ...
// use %X
CopyMI = MRI->getVRegDef(SrcReg);
} else {
if (!MRI->hasOneNonDBGUse(RHS.reg)) {
// VReg is copied into physreg:
// %vregX = def
// ... //< no other def or use of %Y here
// %Y = COPY %vregX
// =>
// %Y = def
// ...
if (!MRI->hasOneNonDBGUse(SrcReg)) {
DEBUG(dbgs() << "\t\tMultiple vreg uses!\n");
return false;
}
MachineInstr *DestMI = MRI->getVRegDef(RHS.reg);
CopyMI = &*MRI->use_instr_nodbg_begin(RHS.reg);
const SlotIndex CopyRegIdx = LIS->getInstructionIndex(*CopyMI).getRegSlot();
const SlotIndex DestRegIdx = LIS->getInstructionIndex(*DestMI).getRegSlot();
if (!LIS->intervalIsInOneMBB(RHS)) {
DEBUG(dbgs() << "\t\tComplex control flow!\n");
return false;
}
MachineInstr &DestMI = *MRI->getVRegDef(SrcReg);
CopyMI = &*MRI->use_instr_nodbg_begin(SrcReg);
SlotIndex CopyRegIdx = LIS->getInstructionIndex(*CopyMI).getRegSlot();
SlotIndex DestRegIdx = LIS->getInstructionIndex(DestMI).getRegSlot();
if (!MRI->isConstantPhysReg(DstReg)) {
// We checked above that there are no interfering defs of the physical
@ -1629,8 +1649,8 @@ bool RegisterCoalescer::joinReservedPhysReg(CoalescerPair &CP) {
// We're going to remove the copy which defines a physical reserved
// register, so remove its valno, etc.
DEBUG(dbgs() << "\t\tRemoving phys reg def of " << DstReg << " at "
<< CopyRegIdx << "\n");
DEBUG(dbgs() << "\t\tRemoving phys reg def of " << PrintReg(DstReg, TRI)
<< " at " << CopyRegIdx << "\n");
LIS->removePhysRegDefAt(DstReg, CopyRegIdx);
// Create a new dead def at the new def location.

View File

@ -509,17 +509,17 @@ void CodeViewContext::encodeDefRange(MCAsmLayout &Layout,
// are artificially constructing.
size_t RecordSize = FixedSizePortion.size() +
sizeof(LocalVariableAddrRange) + 4 * NumGaps;
// Write out the recrod size.
support::endian::Writer<support::little>(OS).write<uint16_t>(RecordSize);
// Write out the record size.
LEWriter.write<uint16_t>(RecordSize);
// Write out the fixed size prefix.
OS << FixedSizePortion;
// Make space for a fixup that will eventually have a section relative
// relocation pointing at the offset where the variable becomes live.
Fixups.push_back(MCFixup::create(Contents.size(), BE, FK_SecRel_4));
Contents.resize(Contents.size() + 4); // Fixup for code start.
LEWriter.write<uint32_t>(0); // Fixup for code start.
// Make space for a fixup that will record the section index for the code.
Fixups.push_back(MCFixup::create(Contents.size(), BE, FK_SecRel_2));
Contents.resize(Contents.size() + 2); // Fixup for section index.
LEWriter.write<uint16_t>(0); // Fixup for section index.
// Write down the range's extent.
LEWriter.write<uint16_t>(Chunk);
@ -529,7 +529,7 @@ void CodeViewContext::encodeDefRange(MCAsmLayout &Layout,
} while (RangeSize > 0);
// Emit the gaps afterwards.
assert((NumGaps == 0 || Bias < MaxDefRange) &&
assert((NumGaps == 0 || Bias <= MaxDefRange) &&
"large ranges should not have gaps");
unsigned GapStartOffset = GapAndRangeSizes[I].second;
for (++I; I != J; ++I) {
@ -537,7 +537,7 @@ void CodeViewContext::encodeDefRange(MCAsmLayout &Layout,
assert(I < GapAndRangeSizes.size());
std::tie(GapSize, RangeSize) = GapAndRangeSizes[I];
LEWriter.write<uint16_t>(GapStartOffset);
LEWriter.write<uint16_t>(RangeSize);
LEWriter.write<uint16_t>(GapSize);
GapStartOffset += GapSize + RangeSize;
}
}

View File

@ -8934,8 +8934,9 @@ static SDValue splitStoreSplat(SelectionDAG &DAG, StoreSDNode &St,
// instructions (stp).
SDLoc DL(&St);
SDValue BasePtr = St.getBasePtr();
const MachinePointerInfo &PtrInfo = St.getPointerInfo();
SDValue NewST1 =
DAG.getStore(St.getChain(), DL, SplatVal, BasePtr, St.getPointerInfo(),
DAG.getStore(St.getChain(), DL, SplatVal, BasePtr, PtrInfo,
OrigAlignment, St.getMemOperand()->getFlags());
unsigned Offset = EltOffset;
@ -8944,7 +8945,7 @@ static SDValue splitStoreSplat(SelectionDAG &DAG, StoreSDNode &St,
SDValue OffsetPtr = DAG.getNode(ISD::ADD, DL, MVT::i64, BasePtr,
DAG.getConstant(Offset, DL, MVT::i64));
NewST1 = DAG.getStore(NewST1.getValue(0), DL, SplatVal, OffsetPtr,
St.getPointerInfo(), Alignment,
PtrInfo.getWithOffset(Offset), Alignment,
St.getMemOperand()->getFlags());
Offset += EltOffset;
}

View File

@ -77,11 +77,9 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
default:
return false;
case X86::TCRETURNdi:
case X86::TCRETURNdicc:
case X86::TCRETURNri:
case X86::TCRETURNmi:
case X86::TCRETURNdi64:
case X86::TCRETURNdi64cc:
case X86::TCRETURNri64:
case X86::TCRETURNmi64: {
bool isMem = Opcode == X86::TCRETURNmi || Opcode == X86::TCRETURNmi64;
@ -99,10 +97,6 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
Offset = StackAdj - MaxTCDelta;
assert(Offset >= 0 && "Offset should never be negative");
if (Opcode == X86::TCRETURNdicc || Opcode == X86::TCRETURNdi64cc) {
assert(Offset == 0 && "Conditional tail call cannot adjust the stack.");
}
if (Offset) {
// Check for possible merge with preceding ADD instruction.
Offset += X86FL->mergeSPUpdates(MBB, MBBI, true);
@ -111,21 +105,12 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
// Jump to label or value in register.
bool IsWin64 = STI->isTargetWin64();
if (Opcode == X86::TCRETURNdi || Opcode == X86::TCRETURNdicc ||
Opcode == X86::TCRETURNdi64 || Opcode == X86::TCRETURNdi64cc) {
if (Opcode == X86::TCRETURNdi || Opcode == X86::TCRETURNdi64) {
unsigned Op;
switch (Opcode) {
case X86::TCRETURNdi:
Op = X86::TAILJMPd;
break;
case X86::TCRETURNdicc:
Op = X86::TAILJMPd_CC;
break;
case X86::TCRETURNdi64cc:
assert(!IsWin64 && "Conditional tail calls confuse the Win64 unwinder.");
// TODO: We could do it for Win64 "leaf" functions though; PR30337.
Op = X86::TAILJMPd64_CC;
break;
default:
// Note: Win64 uses REX prefixes indirect jumps out of functions, but
// not direct ones.
@ -141,10 +126,6 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
MIB.addExternalSymbol(JumpTarget.getSymbolName(),
JumpTarget.getTargetFlags());
}
if (Op == X86::TAILJMPd_CC || Op == X86::TAILJMPd64_CC) {
MIB.addImm(MBBI->getOperand(2).getImm());
}
} else if (Opcode == X86::TCRETURNmi || Opcode == X86::TCRETURNmi64) {
unsigned Op = (Opcode == X86::TCRETURNmi)
? X86::TAILJMPm

View File

@ -264,21 +264,6 @@ let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1,
"jmp{l}\t{*}$dst", [], IIC_JMP_MEM>;
}
// Conditional tail calls are similar to the above, but they are branches
// rather than barriers, and they use EFLAGS.
let isCall = 1, isTerminator = 1, isReturn = 1, isBranch = 1,
isCodeGenOnly = 1, SchedRW = [WriteJumpLd] in
let Uses = [ESP, EFLAGS] in {
def TCRETURNdicc : PseudoI<(outs),
(ins i32imm_pcrel:$dst, i32imm:$offset, i32imm:$cond), []>;
// This gets substituted to a conditional jump instruction in MC lowering.
def TAILJMPd_CC : Ii32PCRel<0x80, RawFrm, (outs),
(ins i32imm_pcrel:$dst, i32imm:$cond),
"",
[], IIC_JMP_REL>;
}
//===----------------------------------------------------------------------===//
// Call Instructions...
@ -340,19 +325,3 @@ let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1,
"rex64 jmp{q}\t{*}$dst", [], IIC_JMP_MEM>;
}
}
// Conditional tail calls are similar to the above, but they are branches
// rather than barriers, and they use EFLAGS.
let isCall = 1, isTerminator = 1, isReturn = 1, isBranch = 1,
isCodeGenOnly = 1, SchedRW = [WriteJumpLd] in
let Uses = [RSP, EFLAGS] in {
def TCRETURNdi64cc : PseudoI<(outs),
(ins i64i32imm_pcrel:$dst, i32imm:$offset,
i32imm:$cond), []>;
// This gets substituted to a conditional jump instruction in MC lowering.
def TAILJMPd64_CC : Ii32PCRel<0x80, RawFrm, (outs),
(ins i64i32imm_pcrel:$dst, i32imm:$cond),
"",
[], IIC_JMP_REL>;
}

View File

@ -5108,85 +5108,6 @@ bool X86InstrInfo::isUnpredicatedTerminator(const MachineInstr &MI) const {
return !isPredicated(MI);
}
bool X86InstrInfo::isUnconditionalTailCall(const MachineInstr &MI) const {
switch (MI.getOpcode()) {
case X86::TCRETURNdi:
case X86::TCRETURNri:
case X86::TCRETURNmi:
case X86::TCRETURNdi64:
case X86::TCRETURNri64:
case X86::TCRETURNmi64:
return true;
default:
return false;
}
}
bool X86InstrInfo::canMakeTailCallConditional(
SmallVectorImpl<MachineOperand> &BranchCond,
const MachineInstr &TailCall) const {
if (TailCall.getOpcode() != X86::TCRETURNdi &&
TailCall.getOpcode() != X86::TCRETURNdi64) {
// Only direct calls can be done with a conditional branch.
return false;
}
if (Subtarget.isTargetWin64()) {
// Conditional tail calls confuse the Win64 unwinder.
// TODO: Allow them for "leaf" functions; PR30337.
return false;
}
assert(BranchCond.size() == 1);
if (BranchCond[0].getImm() > X86::LAST_VALID_COND) {
// Can't make a conditional tail call with this condition.
return false;
}
const X86MachineFunctionInfo *X86FI =
TailCall.getParent()->getParent()->getInfo<X86MachineFunctionInfo>();
if (X86FI->getTCReturnAddrDelta() != 0 ||
TailCall.getOperand(1).getImm() != 0) {
// A conditional tail call cannot do any stack adjustment.
return false;
}
return true;
}
void X86InstrInfo::replaceBranchWithTailCall(
MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &BranchCond,
const MachineInstr &TailCall) const {
assert(canMakeTailCallConditional(BranchCond, TailCall));
MachineBasicBlock::iterator I = MBB.end();
while (I != MBB.begin()) {
--I;
if (I->isDebugValue())
continue;
if (!I->isBranch())
assert(0 && "Can't find the branch to replace!");
X86::CondCode CC = getCondFromBranchOpc(I->getOpcode());
assert(BranchCond.size() == 1);
if (CC != BranchCond[0].getImm())
continue;
break;
}
unsigned Opc = TailCall.getOpcode() == X86::TCRETURNdi ? X86::TCRETURNdicc
: X86::TCRETURNdi64cc;
auto MIB = BuildMI(MBB, I, MBB.findDebugLoc(I), get(Opc));
MIB->addOperand(TailCall.getOperand(0)); // Destination.
MIB.addImm(0); // Stack offset (not used).
MIB->addOperand(BranchCond[0]); // Condition.
MIB.copyImplicitOps(TailCall); // Regmask and (imp-used) parameters.
I->eraseFromParent();
}
// Given a MBB and its TBB, find the FBB which was a fallthrough MBB (it may
// not be a fallthrough MBB now due to layout changes). Return nullptr if the
// fallthrough MBB cannot be identified.

View File

@ -316,13 +316,6 @@ public:
// Branch analysis.
bool isUnpredicatedTerminator(const MachineInstr &MI) const override;
bool isUnconditionalTailCall(const MachineInstr &MI) const override;
bool canMakeTailCallConditional(SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const override;
void replaceBranchWithTailCall(MachineBasicBlock &MBB,
SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const override;
bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,
MachineBasicBlock *&FBB,
SmallVectorImpl<MachineOperand> &Cond,

View File

@ -498,16 +498,11 @@ ReSimplify:
break;
}
// TAILJMPd, TAILJMPd64, TailJMPd_cc - Lower to the correct jump instruction.
// TAILJMPd, TAILJMPd64 - Lower to the correct jump instruction.
{ unsigned Opcode;
case X86::TAILJMPr: Opcode = X86::JMP32r; goto SetTailJmpOpcode;
case X86::TAILJMPd:
case X86::TAILJMPd64: Opcode = X86::JMP_1; goto SetTailJmpOpcode;
case X86::TAILJMPd_CC:
case X86::TAILJMPd64_CC:
Opcode = X86::GetCondBranchFromCond(
static_cast<X86::CondCode>(MI->getOperand(1).getImm()));
goto SetTailJmpOpcode;
SetTailJmpOpcode:
MCOperand Saved = OutMI.getOperand(0);
@ -1281,11 +1276,9 @@ void X86AsmPrinter::EmitInstruction(const MachineInstr *MI) {
case X86::TAILJMPr:
case X86::TAILJMPm:
case X86::TAILJMPd:
case X86::TAILJMPd_CC:
case X86::TAILJMPr64:
case X86::TAILJMPm64:
case X86::TAILJMPd64:
case X86::TAILJMPd64_CC:
case X86::TAILJMPr64_REX:
case X86::TAILJMPm64_REX:
// Lower these as normal, but add some comments.

View File

@ -7,12 +7,16 @@
; CHECK: @h = common global i32 0, align 4, !dbg ![[H:[0-9]+]]
; CHECK: ![[G]] = {{.*}}!DIGlobalVariableExpression(var: ![[GVAR:[0-9]+]], expr: ![[GEXPR:[0-9]+]])
; CHECK: ![[GVAR]] = distinct !DIGlobalVariable(name: "g",
; CHECK: DICompileUnit({{.*}}, imports: ![[IMPORTS:[0-9]+]]
; CHECK: !DIGlobalVariableExpression(var: ![[CVAR:[0-9]+]], expr: ![[CEXPR:[0-9]+]])
; CHECK: ![[CVAR]] = distinct !DIGlobalVariable(name: "c",
; CHECK: ![[CEXPR]] = !DIExpression(DW_OP_constu, 23, DW_OP_stack_value)
; CHECK: ![[H]] = {{.*}}!DIGlobalVariableExpression(var: ![[HVAR:[0-9]+]])
; CHECK: ![[HVAR]] = distinct !DIGlobalVariable(name: "h",
; CHECK: ![[HVAR:[0-9]+]] = distinct !DIGlobalVariable(name: "h",
; CHECK: ![[IMPORTS]] = !{![[CIMPORT:[0-9]+]]}
; CHECK: ![[CIMPORT]] = !DIImportedEntity({{.*}}entity: ![[HVAR]]
; CHECK: ![[GEXPR]] = !DIExpression(DW_OP_plus, 1)
; CHECK: ![[H]] = {{.*}}!DIGlobalVariableExpression(var: ![[HVAR]])
@g = common global i32 0, align 4, !dbg !0
@h = common global i32 0, align 4, !dbg !11
@ -21,9 +25,9 @@
!llvm.ident = !{!9}
!0 = distinct !DIGlobalVariable(name: "g", scope: !1, file: !2, line: 1, type: !5, isLocal: false, isDefinition: true, expr: !DIExpression(DW_OP_plus, 1))
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "clang version 4.0.0 (trunk 286129) (llvm/trunk 286128)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !3, globals: !4)
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "clang version 4.0.0 (trunk 286129) (llvm/trunk 286128)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, imports: !3)
!2 = !DIFile(filename: "a.c", directory: "/")
!3 = !{}
!3 = !{!12}
!4 = !{!0, !10, !11}
!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!6 = !{i32 2, !"Dwarf Version", i32 4}
@ -32,3 +36,4 @@
!9 = !{!"clang version 4.0.0 (trunk 286129) (llvm/trunk 286128)"}
!10 = distinct !DIGlobalVariable(name: "c", scope: !1, file: !2, line: 1, type: !5, isLocal: false, isDefinition: true, expr: !DIExpression(DW_OP_constu, 23, DW_OP_stack_value))
!11 = distinct !DIGlobalVariable(name: "h", scope: !1, file: !2, line: 2, type: !5, isLocal: false, isDefinition: true)
!12 = !DIImportedEntity(tag: DW_TAG_imported_declaration, line: 1, scope: !1, entity: !11)

View File

@ -18,14 +18,13 @@
; CHECK-NEXT: !7 = !DILocalVariable(name: "V1", scope: !6, type: !2)
; CHECK-NEXT: !8 = !DIObjCProperty(name: "P1", type: !1)
; CHECK-NEXT: !9 = !DITemplateTypeParameter(type: !1)
; CHECK-NEXT: !10 = distinct !DIGlobalVariableExpression(var: !11)
; CHECK-NEXT: !11 = !DIGlobalVariable(name: "G",{{.*}} type: !1,
; CHECK-NEXT: !12 = !DITemplateValueParameter(type: !1, value: i32* @G1)
; CHECK-NEXT: !13 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "T2", scope: !0, entity: !1)
; CHECK-NEXT: !14 = !DICompositeType(tag: DW_TAG_structure_type, name: "T3", file: !0, elements: !15, identifier: "T3")
; CHECK-NEXT: !15 = !{!16}
; CHECK-NEXT: !16 = !DISubprogram(scope: !14,
; CHECK-NEXT: !17 = !DIDerivedType(tag: DW_TAG_ptr_to_member_type,{{.*}} extraData: !14)
; CHECK-NEXT: !10 = !DIGlobalVariable(name: "G",{{.*}} type: !1,
; CHECK-NEXT: !11 = !DITemplateValueParameter(type: !1, value: i32* @G1)
; CHECK-NEXT: !12 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "T2", scope: !0, entity: !1)
; CHECK-NEXT: !13 = !DICompositeType(tag: DW_TAG_structure_type, name: "T3", file: !0, elements: !14, identifier: "T3")
; CHECK-NEXT: !14 = !{!15}
; CHECK-NEXT: !15 = !DISubprogram(scope: !13,
; CHECK-NEXT: !16 = !DIDerivedType(tag: DW_TAG_ptr_to_member_type,{{.*}} extraData: !13)
!0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir")
!1 = !DICompositeType(tag: DW_TAG_structure_type, name: "T1", file: !0, identifier: "T1")

View File

@ -0,0 +1,74 @@
; RUN: llc -mtriple=aarch64 -mcpu=cortex-a53 < %s | FileCheck %s
; Tests to check that zero stores which are generated as STP xzr, xzr aren't
; scheduled incorrectly due to incorrect alias information
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1)
%struct.tree_common = type { i8*, i8*, i32 }
; Original test case which exhibited the bug
define void @test1(%struct.tree_common* %t, i32 %code, i8* %type) {
; CHECK-LABEL: test1:
; CHECK: stp xzr, xzr, [x0, #8]
; CHECK: stp xzr, x2, [x0]
; CHECK: str w1, [x0, #16]
entry:
%0 = bitcast %struct.tree_common* %t to i8*
tail call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 24, i32 8, i1 false)
%code1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i32 %code, i32* %code1, align 8
%type2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
store i8* %type, i8** %type2, align 8
ret void
}
; Store to each struct element instead of using memset
define void @test2(%struct.tree_common* %t, i32 %code, i8* %type) {
; CHECK-LABEL: test2:
; CHECK: stp xzr, xzr, [x0]
; CHECK: str wzr, [x0, #16]
; CHECK: str w1, [x0, #16]
; CHECK: str x2, [x0, #8]
entry:
%0 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 0
%1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
%2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i8* zeroinitializer, i8** %0, align 8
store i8* zeroinitializer, i8** %1, align 8
store i32 zeroinitializer, i32* %2, align 8
store i32 %code, i32* %2, align 8
store i8* %type, i8** %1, align 8
ret void
}
; Vector store instead of memset
define void @test3(%struct.tree_common* %t, i32 %code, i8* %type) {
; CHECK-LABEL: test3:
; CHECK: stp xzr, xzr, [x0, #8]
; CHECK: stp xzr, x2, [x0]
; CHECK: str w1, [x0, #16]
entry:
%0 = bitcast %struct.tree_common* %t to <3 x i64>*
store <3 x i64> zeroinitializer, <3 x i64>* %0, align 8
%code1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i32 %code, i32* %code1, align 8
%type2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
store i8* %type, i8** %type2, align 8
ret void
}
; Vector store, then store to vector elements
define void @test4(<3 x i64>* %p, i64 %x, i64 %y) {
; CHECK-LABEL: test4:
; CHECK: stp xzr, xzr, [x0, #8]
; CHECK: stp xzr, x2, [x0]
; CHECK: str x1, [x0, #16]
entry:
store <3 x i64> zeroinitializer, <3 x i64>* %p, align 8
%0 = bitcast <3 x i64>* %p to i64*
%1 = getelementptr inbounds i64, i64* %0, i64 2
store i64 %x, i64* %1, align 8
%2 = getelementptr inbounds i64, i64* %0, i64 1
store i64 %y, i64* %2, align 8
ret void
}

View File

@ -0,0 +1,57 @@
; REQUIRES: asserts
; RUN: llc < %s -mtriple=aarch64 -mcpu=cyclone -mattr=+use-aa -enable-misched -verify-misched -debug-only=misched -o - 2>&1 > /dev/null | FileCheck %s
; Tests to check that the scheduler dependencies derived from alias analysis are
; correct when we have loads that have been split up so that they can later be
; merged into STP.
; CHECK: ********** MI Scheduling **********
; CHECK: test_splat:BB#0 entry
; CHECK: SU({{[0-9]+}}): STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 3; mem:ST4[%3+8]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU1:SU\([0-9]+\)]]
; CHECK: SU({{[0-9]+}}): STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 2; mem:ST4[%3+4]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU2:SU\([0-9]+\)]]
; CHECK: [[SU1]]: STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 3; mem:ST4[%2]
; CHECK: [[SU2]]: STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 2; mem:ST4[%1]
define void @test_splat(i32 %x, i32 %y, i32* %p) {
entry:
%val = load i32, i32* %p, align 4
%0 = getelementptr inbounds i32, i32* %p, i64 1
%1 = getelementptr inbounds i32, i32* %p, i64 2
%2 = getelementptr inbounds i32, i32* %p, i64 3
%vec0 = insertelement <4 x i32> undef, i32 %val, i32 0
%vec1 = insertelement <4 x i32> %vec0, i32 %val, i32 1
%vec2 = insertelement <4 x i32> %vec1, i32 %val, i32 2
%vec3 = insertelement <4 x i32> %vec2, i32 %val, i32 3
%3 = bitcast i32* %0 to <4 x i32>*
store <4 x i32> %vec3, <4 x i32>* %3, align 4
store i32 %x, i32* %2, align 4
store i32 %y, i32* %1, align 4
ret void
}
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1)
%struct.tree_common = type { i8*, i8*, i32 }
; CHECK: ********** MI Scheduling **********
; CHECK: test_zero:BB#0 entry
; CHECK: SU({{[0-9]+}}): STRXui %XZR, %vreg{{[0-9]+}}, 2; mem:ST8[%0+16]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU3:SU\([0-9]+\)]]
; CHECK: SU({{[0-9]+}}): STRXui %XZR, %vreg{{[0-9]+}}, 1; mem:ST8[%0+8]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU4:SU\([0-9]+\)]]
; CHECK: [[SU3]]: STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 4; mem:ST4[%code1]
; CHECK: [[SU4]]: STRXui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 1; mem:ST8[%type2]
define void @test_zero(%struct.tree_common* %t, i32 %code, i8* %type) {
entry:
%0 = bitcast %struct.tree_common* %t to i8*
tail call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 24, i32 8, i1 false)
%code1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i32 %code, i32* %code1, align 8
%type2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
store i8* %type, i8** %type2, align 8
ret void
}

View File

@ -1,19 +1,24 @@
# RUN: llc -mtriple=aarch64-apple-ios -run-pass=simple-register-coalescing %s -o - | FileCheck %s
--- |
define void @func() { ret void }
define void @func0() { ret void }
define void @func1() { ret void }
define void @func2() { ret void }
...
---
# Check coalescing of COPYs from reserved physregs.
# CHECK-LABEL: name: func
name: func
# CHECK-LABEL: name: func0
name: func0
registers:
- { id: 0, class: gpr32 }
- { id: 1, class: gpr64 }
- { id: 2, class: gpr64 }
- { id: 3, class: gpr32 }
- { id: 4, class: gpr64 }
- { id: 5, class: gpr32 }
- { id: 6, class: xseqpairsclass }
- { id: 0, class: gpr32 }
- { id: 1, class: gpr64 }
- { id: 2, class: gpr64 }
- { id: 3, class: gpr32 }
- { id: 4, class: gpr64 }
- { id: 5, class: gpr32 }
- { id: 6, class: xseqpairsclass }
- { id: 7, class: gpr64 }
- { id: 8, class: gpr64sp }
- { id: 9, class: gpr64sp }
body: |
bb.0:
; We usually should not coalesce copies from allocatable physregs.
@ -60,8 +65,74 @@ body: |
; Only coalesce when the source register is reserved as a whole (this is
; a limitation of the current code which cannot update liveness information
; of the non-reserved part).
; CHECK: %6 = COPY %xzr_x0
; CHECK: %6 = COPY %x28_fp
; CHECK: HINT 0, implicit %6
%6 = COPY %xzr_x0
%6 = COPY %x28_fp
HINT 0, implicit %6
; This can be coalesced.
; CHECK: %fp = SUBXri %fp, 4, 0
%8 = SUBXri %fp, 4, 0
%fp = COPY %8
; Cannot coalesce when there are reads of the physreg.
; CHECK-NOT: %fp = SUBXri %fp, 8, 0
; CHECK: %9 = SUBXri %fp, 8, 0
; CHECK: STRXui %fp, %fp, 0
; CHECK: %fp = COPY %9
%9 = SUBXri %fp, 8, 0
STRXui %fp, %fp, 0
%fp = COPY %9
...
---
# Check coalescing of COPYs from reserved physregs.
# CHECK-LABEL: name: func1
name: func1
registers:
- { id: 0, class: gpr64sp }
body: |
bb.0:
successors: %bb.1, %bb.2
; Cannot coalesce physreg because we have reads on other CFG paths (we
; currently abort for any control flow)
; CHECK-NOT: %fp = SUBXri
; CHECK: %0 = SUBXri %fp, 12, 0
; CHECK: CBZX undef %x0, %bb.1
; CHECK: B %bb.2
%0 = SUBXri %fp, 12, 0
CBZX undef %x0, %bb.1
B %bb.2
bb.1:
%fp = COPY %0
RET_ReallyLR
bb.2:
STRXui %fp, %fp, 0
RET_ReallyLR
...
---
# CHECK-LABEL: name: func2
name: func2
registers:
- { id: 0, class: gpr64sp }
body: |
bb.0:
successors: %bb.1, %bb.2
; We can coalesce copies from physreg to vreg across multiple blocks.
; CHECK-NOT: COPY
; CHECK: CBZX undef %x0, %bb.1
; CHECK-NEXT: B %bb.2
%0 = COPY %fp
CBZX undef %x0, %bb.1
B %bb.2
bb.1:
; CHECK: STRXui undef %x0, %fp, 0
; CHECK-NEXT: RET_ReallyLR
STRXui undef %x0, %0, 0
RET_ReallyLR
bb.2:
RET_ReallyLR
...

View File

@ -0,0 +1,22 @@
# RUN: llc -o - %s -mtriple=armv7s-- -run-pass=machine-cp | FileCheck %s
---
# Test that machine copy prop recognizes the implicit-def operands on a COPY
# as clobbering the register.
# CHECK-LABEL: name: func
# CHECK: %d2 = VMOVv2i32 2, 14, _
# CHECK: %s5 = COPY %s0, implicit %q1, implicit-def %q1
# CHECK: VST1q32 %r0, 0, %q1, 14, _
# The following two COPYs must not be removed
# CHECK: %s4 = COPY %s20, implicit-def %q1
# CHECK: %s5 = COPY %s0, implicit killed %d0, implicit %q1, implicit-def %q1
# CHECK: VST1q32 %r2, 0, %q1, 14, _
name: func
body: |
bb.0:
%d2 = VMOVv2i32 2, 14, _
%s5 = COPY %s0, implicit %q1, implicit-def %q1
VST1q32 %r0, 0, %q1, 14, _
%s4 = COPY %s20, implicit-def %q1
%s5 = COPY %s0, implicit killed %d0, implicit %q1, implicit-def %q1
VST1q32 %r2, 0, %q1, 14, _
...

View File

@ -1,53 +0,0 @@
; RUN: llc < %s -mtriple=i686-linux -show-mc-encoding | FileCheck %s
; RUN: llc < %s -mtriple=x86_64-linux -show-mc-encoding | FileCheck %s
declare void @foo()
declare void @bar()
define void @f(i32 %x, i32 %y) optsize {
entry:
%p = icmp eq i32 %x, %y
br i1 %p, label %bb1, label %bb2
bb1:
tail call void @foo()
ret void
bb2:
tail call void @bar()
ret void
; CHECK-LABEL: f:
; CHECK: cmp
; CHECK: jne bar
; Check that the asm doesn't just look good, but uses the correct encoding.
; CHECK: encoding: [0x75,A]
; CHECK: jmp foo
}
declare x86_thiscallcc zeroext i1 @baz(i8*, i32)
define x86_thiscallcc zeroext i1 @BlockPlacementTest(i8* %this, i32 %x) optsize {
entry:
%and = and i32 %x, 42
%tobool = icmp eq i32 %and, 0
br i1 %tobool, label %land.end, label %land.rhs
land.rhs:
%and6 = and i32 %x, 44
%tobool7 = icmp eq i32 %and6, 0
br i1 %tobool7, label %lor.rhs, label %land.end
lor.rhs:
%call = tail call x86_thiscallcc zeroext i1 @baz(i8* %this, i32 %x) #2
br label %land.end
land.end:
%0 = phi i1 [ false, %entry ], [ true, %land.rhs ], [ %call, %lor.rhs ]
ret i1 %0
; Make sure machine block placement isn't confused by the conditional tail call,
; but sees that it can fall through to the next block.
; CHECK-LABEL: BlockPlacementTest
; CHECK: je baz
; CHECK-NOT: xor
; CHECK: ret
}

View File

@ -93,7 +93,8 @@ if.end:
; CHECK-LABEL: test2_1:
; CHECK: movzbl
; CHECK: cmpl $256
; CHECK: je bar
; CHECK: jne .LBB
; CHECK: jmp bar
define void @test2_1(i32 %X) nounwind minsize {
entry:
%and = and i32 %X, 255
@ -223,7 +224,8 @@ if.end:
; CHECK-LABEL: test_sext_i8_icmp_255:
; CHECK: movb $1,
; CHECK: testb
; CHECK: je bar
; CHECK: jne .LBB
; CHECK: jmp bar
define void @test_sext_i8_icmp_255(i8 %x) nounwind minsize {
entry:
%sext = sext i8 %x to i32

View File

@ -1,85 +0,0 @@
# RUN: llc -mtriple x86_64-- -verify-machineinstrs -run-pass branch-folder -o - %s | FileCheck %s
# Check the TCRETURNdi64cc optimization.
--- |
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define i64 @test(i64 %arg, i8* %arg1) optsize {
%tmp = icmp ult i64 %arg, 100
br i1 %tmp, label %1, label %4
%tmp3 = icmp ult i64 %arg, 10
br i1 %tmp3, label %2, label %3
%tmp5 = tail call i64 @f1(i8* %arg1, i64 %arg)
ret i64 %tmp5
%tmp7 = tail call i64 @f2(i8* %arg1, i64 %arg)
ret i64 %tmp7
ret i64 123
}
declare i64 @f1(i8*, i64)
declare i64 @f2(i8*, i64)
...
---
name: test
tracksRegLiveness: true
liveins:
- { reg: '%rdi' }
- { reg: '%rsi' }
body: |
bb.0:
successors: %bb.1, %bb.4
liveins: %rdi, %rsi
%rax = COPY %rdi
CMP64ri8 %rax, 99, implicit-def %eflags
JA_1 %bb.4, implicit %eflags
JMP_1 %bb.1
; CHECK: bb.1:
; CHECK-NEXT: successors: %bb.2({{[^)]+}}){{$}}
; CHECK-NEXT: liveins: %rax, %rsi
; CHECK-NEXT: {{^ $}}
; CHECK-NEXT: %rdi = COPY %rsi
; CHECK-NEXT: %rsi = COPY %rax
; CHECK-NEXT: CMP64ri8 %rax, 9, implicit-def %eflags
; CHECK-NEXT: TCRETURNdi64cc @f1, 0, 3, csr_64, implicit %rsp, implicit %eflags, implicit %rsp, implicit %rdi, implicit %rsi
bb.1:
successors: %bb.2, %bb.3
liveins: %rax, %rsi
CMP64ri8 %rax, 9, implicit-def %eflags
JA_1 %bb.3, implicit %eflags
JMP_1 %bb.2
bb.2:
liveins: %rax, %rsi
%rdi = COPY %rsi
%rsi = COPY %rax
TCRETURNdi64 @f1, 0, csr_64, implicit %rsp, implicit %rdi, implicit %rsi
; CHECK: bb.2:
; CHECK-NEXT: liveins: %rax, %rdi, %rsi
; CHECK-NEXT: {{^ $}}
; CHECK-NEXT: TCRETURNdi64 @f2, 0, csr_64, implicit %rsp, implicit %rdi, implicit %rsi
bb.3:
liveins: %rax, %rsi
%rdi = COPY %rsi
%rsi = COPY %rax
TCRETURNdi64 @f2, 0, csr_64, implicit %rsp, implicit %rdi, implicit %rsi
bb.4:
dead %eax = MOV32ri64 123, implicit-def %rax
RET 0, %rax
...

View File

@ -37,6 +37,19 @@
# CHECK-NEXT: ISectStart: 0x0
# CHECK-NEXT: Range: 0x1
# CHECK-NEXT: }
# CHECK-NEXT: }
# CHECK-NEXT: DefRangeRegister {
# CHECK-NEXT: Register: 23
# CHECK-NEXT: MayHaveNoName: 0
# CHECK-NEXT: LocalVariableAddrRange {
# CHECK-NEXT: OffsetStart: .text+0x2001C
# CHECK-NEXT: ISectStart: 0x0
# CHECK-NEXT: Range: 0xF000
# CHECK-NEXT: }
# CHECK-NEXT: LocalVariableAddrGap [
# CHECK-NEXT: GapStartOffset: 0x1
# CHECK-NEXT: Range: 0xEFFE
# CHECK-NEXT: ]
# CHECK-NEXT: }
.text
@ -62,6 +75,16 @@ f: # @f
.Lbegin3:
nop
.Lend3:
# Create a range that is exactly 0xF000 bytes long with a gap in the
# middle.
.Lbegin4:
nop
.Lend4:
.fill 0xeffe, 1, 0x90
.Lbegin5:
nop
.Lend5:
ret
.Lfunc_end0:
@ -94,6 +117,7 @@ f: # @f
.asciz "p"
.Ltmp19:
.cv_def_range .Lbegin0 .Lend0 .Lbegin1 .Lend1 .Lbegin2 .Lend2 .Lbegin3 .Lend3, "A\021\027\000\000\000"
.cv_def_range .Lbegin4 .Lend4 .Lbegin5 .Lend5, "A\021\027\000\000\000"
.short 2 # Record length
.short 4431 # Record kind: S_PROC_ID_END
.Ltmp15:

View File

@ -131,4 +131,44 @@ TEST(IListIteratorTest, CheckEraseReverse) {
EXPECT_EQ(L.rend(), RI);
}
TEST(IListIteratorTest, ReverseConstructor) {
simple_ilist<Node> L;
const simple_ilist<Node> &CL = L;
Node A, B;
L.insert(L.end(), A);
L.insert(L.end(), B);
// Save typing.
typedef simple_ilist<Node>::iterator iterator;
typedef simple_ilist<Node>::reverse_iterator reverse_iterator;
typedef simple_ilist<Node>::const_iterator const_iterator;
typedef simple_ilist<Node>::const_reverse_iterator const_reverse_iterator;
// Check conversion values.
EXPECT_EQ(L.begin(), iterator(L.rend()));
EXPECT_EQ(++L.begin(), iterator(++L.rbegin()));
EXPECT_EQ(L.end(), iterator(L.rbegin()));
EXPECT_EQ(L.rbegin(), reverse_iterator(L.end()));
EXPECT_EQ(++L.rbegin(), reverse_iterator(++L.begin()));
EXPECT_EQ(L.rend(), reverse_iterator(L.begin()));
// Check const iterator constructors.
EXPECT_EQ(CL.begin(), const_iterator(L.rend()));
EXPECT_EQ(CL.begin(), const_iterator(CL.rend()));
EXPECT_EQ(CL.rbegin(), const_reverse_iterator(L.end()));
EXPECT_EQ(CL.rbegin(), const_reverse_iterator(CL.end()));
// Confirm lack of implicit conversions.
static_assert(!std::is_convertible<iterator, reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(!std::is_convertible<reverse_iterator, iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_iterator, const_reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_reverse_iterator, const_iterator>::value,
"unexpected implicit conversion");
}
} // end namespace

View File

@ -130,4 +130,68 @@ TEST(MachineInstrBundleIteratorTest, CompareToBundledMI) {
ASSERT_TRUE(CI != CMBI.getIterator());
}
struct MyUnbundledInstr
: ilist_node<MyUnbundledInstr, ilist_sentinel_tracking<true>> {
bool isBundledWithPred() const { return false; }
bool isBundledWithSucc() const { return false; }
};
typedef MachineInstrBundleIterator<MyUnbundledInstr> unbundled_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr>
const_unbundled_iterator;
typedef MachineInstrBundleIterator<MyUnbundledInstr, true>
reverse_unbundled_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr, true>
const_reverse_unbundled_iterator;
TEST(MachineInstrBundleIteratorTest, ReverseConstructor) {
simple_ilist<MyUnbundledInstr, ilist_sentinel_tracking<true>> L;
const auto &CL = L;
MyUnbundledInstr A, B;
L.insert(L.end(), A);
L.insert(L.end(), B);
// Save typing.
typedef MachineInstrBundleIterator<MyUnbundledInstr> iterator;
typedef MachineInstrBundleIterator<MyUnbundledInstr, true> reverse_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr> const_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr, true>
const_reverse_iterator;
// Convert to bundle iterators.
auto begin = [&]() -> iterator { return L.begin(); };
auto end = [&]() -> iterator { return L.end(); };
auto rbegin = [&]() -> reverse_iterator { return L.rbegin(); };
auto rend = [&]() -> reverse_iterator { return L.rend(); };
auto cbegin = [&]() -> const_iterator { return CL.begin(); };
auto cend = [&]() -> const_iterator { return CL.end(); };
auto crbegin = [&]() -> const_reverse_iterator { return CL.rbegin(); };
auto crend = [&]() -> const_reverse_iterator { return CL.rend(); };
// Check conversion values.
EXPECT_EQ(begin(), iterator(rend()));
EXPECT_EQ(++begin(), iterator(++rbegin()));
EXPECT_EQ(end(), iterator(rbegin()));
EXPECT_EQ(rbegin(), reverse_iterator(end()));
EXPECT_EQ(++rbegin(), reverse_iterator(++begin()));
EXPECT_EQ(rend(), reverse_iterator(begin()));
// Check const iterator constructors.
EXPECT_EQ(cbegin(), const_iterator(rend()));
EXPECT_EQ(cbegin(), const_iterator(crend()));
EXPECT_EQ(crbegin(), const_reverse_iterator(end()));
EXPECT_EQ(crbegin(), const_reverse_iterator(cend()));
// Confirm lack of implicit conversions.
static_assert(!std::is_convertible<iterator, reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(!std::is_convertible<reverse_iterator, iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_iterator, const_reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_reverse_iterator, const_iterator>::value,
"unexpected implicit conversion");
}
} // end namespace

View File

@ -47,7 +47,6 @@ svn.exe export -r %revision% http://llvm.org/svn/llvm-project/clang-tools-extra/
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/lld/%branch% llvm/tools/lld || exit /b
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/compiler-rt/%branch% llvm/projects/compiler-rt || exit /b
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/openmp/%branch% llvm/projects/openmp || exit /b
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/lldb/%branch% llvm/tools/lldb || exit /b
REM Setting CMAKE_CL_SHOWINCLUDES_PREFIX to work around PR27226.