Vendor import of llvm release_40 branch r294803:

https://llvm.org/svn/llvm-project/llvm/branches/release_40@294803
dim 2017-02-11 13:25:24 +00:00
parent fd782386ad
commit 43b21f1541
31 changed files with 662 additions and 416 deletions

View File

@ -55,6 +55,12 @@ Non-comprehensive list of changes in this release
* LLVM now handles invariant.group across different basic blocks, which makes
it possible to devirtualize virtual calls inside loops.
* The aggressive dead code elimination phase ("adce") now removes
branches which do not affect program behavior. Loops are retained by
default since they may be infinite, but these can also be removed
with the LLVM option ``-adce-remove-loops`` when the loop body otherwise has
no live operations.
* ... next change ...
.. NOTE
@ -75,6 +81,95 @@ Non-comprehensive list of changes in this release
* Significant build-time and binary-size improvements when compiling with
debug info (-g).
Code Generation Testing
-----------------------
Passes that work on the machine instruction representation can be tested with
the .mir serialization format. ``llc`` supports the ``-run-pass``,
``-stop-after``, ``-stop-before``, ``-start-after``, and ``-start-before``
options to run a single pass of the code generation pipeline, or to stop or
start the code generation pipeline at a given point.
Additional information can be found in the :doc:`MIRLangRef`. The format is
used by the tests ending in ``.mir`` in the ``test/CodeGen`` directory.
This feature has been available since 2015. It has seen increasing use lately
and had not previously been mentioned in the release notes.
Intrusive list API overhaul
---------------------------
The intrusive list infrastructure was substantially rewritten over the last
couple of releases, primarily to excise undefined behaviour. The biggest
changes landed in this release; a short usage sketch follows the list below.
* ``simple_ilist<T>`` is a lower-level intrusive list that never takes
ownership of its nodes. New intrusive-list clients should consider using it
instead of ``ilist<T>``.
* ``ilist_tag<class>`` allows a single data type to be inserted into two
parallel intrusive lists. A type can inherit twice from ``ilist_node``,
first using ``ilist_node<T,ilist_tag<A>>`` (enabling insertion into
``simple_ilist<T,ilist_tag<A>>``) and second using
``ilist_node<T,ilist_tag<B>>`` (enabling insertion into
``simple_ilist<T,ilist_tag<B>>``), where ``A`` and ``B`` are arbitrary
types.
* ``ilist_sentinel_tracking<bool>`` controls whether an iterator knows
whether it's pointing at the sentinel (``end()``). By default, sentinel
tracking is on when ABI-breaking checks are enabled, and off otherwise;
this is used for an assertion when dereferencing ``end()`` (this assertion
triggered often in practice, and many backend bugs were fixed). Explicitly
turning on sentinel tracking also enables ``iterator::isEnd()``. This is
used by ``MachineInstrBundleIterator`` to iterate over bundles.
* ``ilist<T>`` is built on top of ``simple_ilist<T>``, and supports the same
configuration options. As before (and unlike ``simple_ilist<T>``),
``ilist<T>`` takes ownership of its nodes. However, it no longer supports
*allocating* nodes, and is now equivalent to ``iplist<T>``. ``iplist<T>``
will likely be removed in the future.
* ``ilist<T>`` now always uses ``ilist_traits<T>``. Instead of passing a
custom traits class in via a template parameter, clients that want to
customize the traits should specialize ``ilist_traits<T>``. Clients that
want to avoid ownership can specialize ``ilist_alloc_traits<T>`` to inherit
from ``ilist_noalloc_traits<T>`` (or to do something funky); clients that
need callbacks can specialize ``ilist_callback_traits<T>`` directly.
* The underlying data structure is now a simple recursive linked list. The
sentinel node contains only a "next" (``begin()``) and "prev" (``rbegin()``)
pointer and is stored in the same allocation as ``simple_ilist<T>``.
Previously, it was malloc-allocated on-demand by default, although the
now-defunct ``ilist_sentinel_traits<T>`` was sometimes specialized to avoid
this.
* The ``reverse_iterator`` class no longer uses ``std::reverse_iterator``.
Instead, it now has a handle to the same node that it dereferences to.
Reverse iterators now have the same iterator invalidation semantics as
forward iterators.
* ``iterator`` and ``reverse_iterator`` have explicit conversion constructors
that match ``std::reverse_iterator``'s off-by-one semantics, so that
reversing the end points of an iterator range results in the same range
(albeit in reverse). I.e., ``reverse_iterator(begin())`` equals
``rend()``.
* ``iterator::getReverse()`` and ``reverse_iterator::getReverse()`` return an
iterator that dereferences to the *same* node. I.e.,
``begin().getReverse()`` equals ``--rend()``.
* ``ilist_node<T>::getIterator()`` and
``ilist_node<T>::getReverseIterator()`` return the forward and reverse
iterators that dereference to the current node. I.e.,
``begin()->getIterator()`` equals ``begin()`` and
``rbegin()->getReverseIterator()`` equals ``rbegin()``.
* ``iterator`` now stores an ``ilist_node_base*`` instead of a ``T*``. The
implicit conversions between ``ilist<T>::iterator`` and ``T*`` have been
removed. Clients may use ``N->getIterator()`` (if not ``nullptr``) or
``&*I`` (if not ``end()``); alternatively, clients may refactor to use
references for known-good nodes.
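To illustrate the new API, here is a minimal sketch (not taken from the LLVM
sources) that puts one node type on two parallel lists and exercises the new
iterator conversions. The ``ListA``/``ListB`` tag types, ``MyNode``, and
``demo()`` are made-up names, and the example assumes the
``llvm/ADT/ilist_node.h`` and ``llvm/ADT/simple_ilist.h`` headers from this
release:

.. code-block:: c++

  #include "llvm/ADT/ilist_node.h"
  #include "llvm/ADT/simple_ilist.h"
  #include <cassert>

  // Arbitrary tag types; the names are purely illustrative.
  struct ListA {};
  struct ListB {};

  // A node type that can be inserted into two parallel intrusive lists.
  struct MyNode : llvm::ilist_node<MyNode, llvm::ilist_tag<ListA>>,
                  llvm::ilist_node<MyNode, llvm::ilist_tag<ListB>> {
    int Value = 0;
  };

  void demo() {
    MyNode N0, N1;

    // simple_ilist never takes ownership of its nodes.
    llvm::simple_ilist<MyNode, llvm::ilist_tag<ListA>> A;
    llvm::simple_ilist<MyNode, llvm::ilist_tag<ListB>> B;
    A.push_back(N0);
    A.push_back(N1);
    B.push_back(N1); // N1 is on both lists at the same time.

    typedef llvm::simple_ilist<MyNode,
                               llvm::ilist_tag<ListA>>::reverse_iterator
        reverse_iterator;

    // The explicit conversions preserve range boundaries
    // (std::reverse_iterator off-by-one semantics).
    assert(reverse_iterator(A.begin()) == A.rend());
    assert(reverse_iterator(A.end()) == A.rbegin());

    // getReverse() returns an iterator dereferencing to the *same* node.
    assert(&*A.begin().getReverse() == &*A.begin());
  }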
Changes to the LLVM IR
----------------------
@ -133,9 +228,23 @@ Changes to the AMDGPU Target
Changes to the AVR Target
-----------------------------
* The entire backend has been merged in-tree with all tests passing. All of
the instruction selection code and the machine code backend have landed
recently and are fully usable.
This marks the first release where the AVR backend has been completely merged
from a fork into LLVM trunk. The backend is still marked experimental, but
is generally quite usable. All downstream development has halted on
`GitHub <https://github.com/avr-llvm/llvm>`_, and changes now go directly into
LLVM trunk.
* Instruction selector and pseudo instruction expansion pass landed
* `read_register` and `write_register` intrinsics are now supported
* Support stack stores greater than 63 bytes from the bottom of the stack
* A number of assertion errors have been fixed
* Support stores to `undef` locations
* Very basic support for the target has been added to clang
* Small optimizations to some 16-bit boolean expressions
Most of the work behind the scenes has been on the correctness of generated
assembly, and on fixing some assertions we would hit on well-formed
inputs.
Changes to the OCaml bindings
-----------------------------

View File

@ -48,9 +48,9 @@ copyright = u'2003-%d, LLVM Project' % date.today().year
# built documents.
#
# The short X.Y version.
version = '4.0'
version = '4'
# The full version, including alpha/beta/rc tags.
release = '4.0'
release = '4'
# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.

View File

@ -102,10 +102,23 @@ public:
return *this;
}
/// Convert from an iterator to its reverse.
/// Explicit conversion between forward/reverse iterators.
///
/// TODO: Roll this into the implicit constructor once we're sure that no one
/// is relying on the std::reverse_iterator off-by-one semantics.
/// Translate between forward and reverse iterators without changing range
/// boundaries. The resulting iterator will dereference (and have a handle)
/// to the previous node, which is somewhat unexpected; but converting the
/// two endpoints in a range will give the same range in reverse.
///
/// This matches std::reverse_iterator conversions.
explicit ilist_iterator(
const ilist_iterator<OptionsT, !IsReverse, IsConst> &RHS)
: ilist_iterator(++RHS.getReverse()) {}
/// Get a reverse iterator to the same node.
///
/// Gives a reverse iterator that will dereference (and have a handle) to the
/// same node. Converting the endpoint iterators in a range will give a
/// different range; for range operations, use the explicit conversions.
ilist_iterator<OptionsT, !IsReverse, IsConst> getReverse() const {
if (NodePtr)
return ilist_iterator<OptionsT, !IsReverse, IsConst>(*NodePtr);

View File

@ -153,6 +153,18 @@ public:
: MII(I.getInstrIterator()) {}
MachineInstrBundleIterator() : MII(nullptr) {}
/// Explicit conversion between forward/reverse iterators.
///
/// Translate between forward and reverse iterators without changing range
/// boundaries. The resulting iterator will dereference (and have a handle)
/// to the previous node, which is somewhat unexpected; but converting the
/// two endpoints in a range will give the same range in reverse.
///
/// This matches std::reverse_iterator conversions.
explicit MachineInstrBundleIterator(
const MachineInstrBundleIterator<Ty, !IsReverse> &I)
: MachineInstrBundleIterator(++I.getReverse()) {}
/// Get the bundle iterator for the given instruction's bundle.
static MachineInstrBundleIterator getAtBundleBegin(instr_iterator MI) {
return MachineInstrBundleIteratorHelper<IsReverse>::getBundleBegin(MI);
@ -258,6 +270,11 @@ public:
nonconst_iterator getNonConstIterator() const { return MII.getNonConst(); }
/// Get a reverse iterator to the same node.
///
/// Gives a reverse iterator that will dereference (and have a handle) to the
/// same node. Converting the endpoint iterators in a range will give a
/// different range; for range operations, use the explicit conversions.
reverse_iterator getReverse() const { return MII.getReverse(); }
};

View File

@ -311,6 +311,8 @@ template <typename IRUnitT, typename... ExtraArgTs> class AnalysisManager;
template <typename DerivedT> struct PassInfoMixin {
/// Gets the name of the pass we are mixed into.
static StringRef name() {
static_assert(std::is_base_of<PassInfoMixin, DerivedT>::value,
"Must pass the derived type as the template argument!");
StringRef Name = getTypeName<DerivedT>();
if (Name.startswith("llvm::"))
Name = Name.drop_front(strlen("llvm::"));
@ -339,7 +341,11 @@ struct AnalysisInfoMixin : PassInfoMixin<DerivedT> {
/// known platform with this limitation is Windows DLL builds, specifically
/// building each part of LLVM as a DLL. If we ever remove that build
/// configuration, this mixin can provide the static key as well.
static AnalysisKey *ID() { return &DerivedT::Key; }
static AnalysisKey *ID() {
static_assert(std::is_base_of<AnalysisInfoMixin, DerivedT>::value,
"Must pass the derived type as the template argument!");
return &DerivedT::Key;
}
};
/// This templated class represents "all analyses that operate over \<a
@ -1010,7 +1016,7 @@ extern template class InnerAnalysisManagerProxy<FunctionAnalysisManager,
template <typename AnalysisManagerT, typename IRUnitT, typename... ExtraArgTs>
class OuterAnalysisManagerProxy
: public AnalysisInfoMixin<
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT>> {
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT, ExtraArgTs...>> {
public:
/// \brief Result proxy object for \c OuterAnalysisManagerProxy.
class Result {
@ -1072,7 +1078,7 @@ public:
private:
friend AnalysisInfoMixin<
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT>>;
OuterAnalysisManagerProxy<AnalysisManagerT, IRUnitT, ExtraArgTs...>>;
static AnalysisKey Key;
const AnalysisManagerT *AM;

View File

@ -1108,25 +1108,6 @@ public:
/// terminator instruction that has not been predicated.
virtual bool isUnpredicatedTerminator(const MachineInstr &MI) const;
/// Returns true if MI is an unconditional tail call.
virtual bool isUnconditionalTailCall(const MachineInstr &MI) const {
return false;
}
/// Returns true if the tail call can be made conditional on BranchCond.
virtual bool
canMakeTailCallConditional(SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const {
return false;
}
/// Replace the conditional branch in MBB with a conditional tail call.
virtual void replaceBranchWithTailCall(MachineBasicBlock &MBB,
SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const {
llvm_unreachable("Target didn't implement replaceBranchWithTailCall!");
}
/// Convert the instruction into a predicated instruction.
/// It returns true if the operation was successful.
virtual bool PredicateInstruction(MachineInstr &MI,

View File

@ -448,6 +448,7 @@ class MetadataLoader::MetadataLoaderImpl {
bool StripTBAA = false;
bool HasSeenOldLoopTags = false;
bool NeedUpgradeToDIGlobalVariableExpression = false;
/// True if metadata is being parsed for a module being ThinLTO imported.
bool IsImporting = false;
@ -473,6 +474,45 @@ class MetadataLoader::MetadataLoaderImpl {
CUSubprograms.clear();
}
/// Upgrade old-style bare DIGlobalVariables to DIGlobalVariableExpressions.
void upgradeCUVariables() {
if (!NeedUpgradeToDIGlobalVariableExpression)
return;
// Upgrade list of variables attached to the CUs.
if (NamedMDNode *CUNodes = TheModule.getNamedMetadata("llvm.dbg.cu"))
for (unsigned I = 0, E = CUNodes->getNumOperands(); I != E; ++I) {
auto *CU = cast<DICompileUnit>(CUNodes->getOperand(I));
if (auto *GVs = dyn_cast_or_null<MDTuple>(CU->getRawGlobalVariables()))
for (unsigned I = 0; I < GVs->getNumOperands(); I++)
if (auto *GV =
dyn_cast_or_null<DIGlobalVariable>(GVs->getOperand(I))) {
auto *DGVE =
DIGlobalVariableExpression::getDistinct(Context, GV, nullptr);
GVs->replaceOperandWith(I, DGVE);
}
}
// Upgrade variables attached to globals.
for (auto &GV : TheModule.globals()) {
SmallVector<MDNode *, 1> MDs, NewMDs;
GV.getMetadata(LLVMContext::MD_dbg, MDs);
GV.eraseMetadata(LLVMContext::MD_dbg);
for (auto *MD : MDs)
if (auto *DGV = dyn_cast_or_null<DIGlobalVariable>(MD)) {
auto *DGVE =
DIGlobalVariableExpression::getDistinct(Context, DGV, nullptr);
GV.addMetadata(LLVMContext::MD_dbg, *DGVE);
} else
GV.addMetadata(LLVMContext::MD_dbg, *MD);
}
}
void upgradeDebugInfo() {
upgradeCUSubprograms();
upgradeCUVariables();
}
public:
MetadataLoaderImpl(BitstreamCursor &Stream, Module &TheModule,
BitcodeReaderValueList &ValueList,
@ -726,7 +766,7 @@ Error MetadataLoader::MetadataLoaderImpl::parseMetadata(bool ModuleLevel) {
// Reading the named metadata created forward references and/or
// placeholders, that we flush here.
resolveForwardRefsAndPlaceholders(Placeholders);
upgradeCUSubprograms();
upgradeDebugInfo();
// Return at the beginning of the block, since it is easy to skip it
// entirely from there.
Stream.ReadBlockEnd(); // Pop the abbrev block context.
@ -750,7 +790,7 @@ Error MetadataLoader::MetadataLoaderImpl::parseMetadata(bool ModuleLevel) {
return error("Malformed block");
case BitstreamEntry::EndBlock:
resolveForwardRefsAndPlaceholders(Placeholders);
upgradeCUSubprograms();
upgradeDebugInfo();
return Error::success();
case BitstreamEntry::Record:
// The interesting case.
@ -1420,11 +1460,17 @@ Error MetadataLoader::MetadataLoaderImpl::parseOneMetadata(
getDITypeRefOrNull(Record[6]), Record[7], Record[8],
getMDOrNull(Record[10]), AlignInBits));
auto *DGVE = DIGlobalVariableExpression::getDistinct(Context, DGV, Expr);
MetadataList.assignValue(DGVE, NextMetadataNo);
NextMetadataNo++;
DIGlobalVariableExpression *DGVE = nullptr;
if (Attach || Expr)
DGVE = DIGlobalVariableExpression::getDistinct(Context, DGV, Expr);
else
NeedUpgradeToDIGlobalVariableExpression = true;
if (Attach)
Attach->addDebugInfo(DGVE);
auto *MDNode = Expr ? cast<Metadata>(DGVE) : cast<Metadata>(DGV);
MetadataList.assignValue(MDNode, NextMetadataNo);
NextMetadataNo++;
} else
return error("Invalid record");

View File

@ -49,7 +49,6 @@ STATISTIC(NumDeadBlocks, "Number of dead blocks removed");
STATISTIC(NumBranchOpts, "Number of branches optimized");
STATISTIC(NumTailMerge , "Number of block tails merged");
STATISTIC(NumHoist , "Number of times common instructions are hoisted");
STATISTIC(NumTailCalls, "Number of tail calls optimized");
static cl::opt<cl::boolOrDefault> FlagEnableTailMerge("enable-tail-merge",
cl::init(cl::BOU_UNSET), cl::Hidden);
@ -1387,42 +1386,6 @@ ReoptimizeBlock:
}
}
if (!IsEmptyBlock(MBB) && MBB->pred_size() == 1 &&
MF.getFunction()->optForSize()) {
// Changing "Jcc foo; foo: jmp bar;" into "Jcc bar;" might change the branch
// direction, thereby defeating careful block placement and regressing
// performance. Therefore, only consider this for optsize functions.
MachineInstr &TailCall = *MBB->getFirstNonDebugInstr();
if (TII->isUnconditionalTailCall(TailCall)) {
MachineBasicBlock *Pred = *MBB->pred_begin();
MachineBasicBlock *PredTBB = nullptr, *PredFBB = nullptr;
SmallVector<MachineOperand, 4> PredCond;
bool PredAnalyzable =
!TII->analyzeBranch(*Pred, PredTBB, PredFBB, PredCond, true);
if (PredAnalyzable && !PredCond.empty() && PredTBB == MBB) {
// The predecessor has a conditional branch to this block which consists
// of only a tail call. Try to fold the tail call into the conditional
// branch.
if (TII->canMakeTailCallConditional(PredCond, TailCall)) {
// TODO: It would be nice if analyzeBranch() could provide a pointer
// to the branch insturction so replaceBranchWithTailCall() doesn't
// have to search for it.
TII->replaceBranchWithTailCall(*Pred, PredCond, TailCall);
++NumTailCalls;
Pred->removeSuccessor(MBB);
MadeChange = true;
return MadeChange;
}
}
// If the predecessor is falling through to this block, we could reverse
// the branch condition and fold the tail call into that. However, after
// that we might have to re-arrange the CFG to fall through to the other
// block and there is a high risk of regressing code size rather than
// improving it.
}
}
// Analyze the branch in the current block.
MachineBasicBlock *CurTBB = nullptr, *CurFBB = nullptr;
SmallVector<MachineOperand, 4> CurCond;

View File

@ -61,6 +61,7 @@ namespace {
private:
void ClobberRegister(unsigned Reg);
void ReadRegister(unsigned Reg);
void CopyPropagateBlock(MachineBasicBlock &MBB);
bool eraseIfRedundant(MachineInstr &Copy, unsigned Src, unsigned Def);
@ -120,6 +121,18 @@ void MachineCopyPropagation::ClobberRegister(unsigned Reg) {
}
}
void MachineCopyPropagation::ReadRegister(unsigned Reg) {
// If 'Reg' is defined by a copy, the copy is no longer a candidate
// for elimination.
for (MCRegAliasIterator AI(Reg, TRI, true); AI.isValid(); ++AI) {
Reg2MIMap::iterator CI = CopyMap.find(*AI);
if (CI != CopyMap.end()) {
DEBUG(dbgs() << "MCP: Copy is used - not dead: "; CI->second->dump());
MaybeDeadCopies.remove(CI->second);
}
}
}
/// Return true if \p PreviousCopy did copy register \p Src to register \p Def.
/// This fact may have been obscured by sub register usage or may not be true at
/// all even though Src and Def are subregisters of the registers used in
@ -212,12 +225,14 @@ void MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock &MBB) {
// If Src is defined by a previous copy, the previous copy cannot be
// eliminated.
for (MCRegAliasIterator AI(Src, TRI, true); AI.isValid(); ++AI) {
Reg2MIMap::iterator CI = CopyMap.find(*AI);
if (CI != CopyMap.end()) {
DEBUG(dbgs() << "MCP: Copy is no longer dead: "; CI->second->dump());
MaybeDeadCopies.remove(CI->second);
}
ReadRegister(Src);
for (const MachineOperand &MO : MI->implicit_operands()) {
if (!MO.isReg() || !MO.readsReg())
continue;
unsigned Reg = MO.getReg();
if (!Reg)
continue;
ReadRegister(Reg);
}
DEBUG(dbgs() << "MCP: Copy is a deletion candidate: "; MI->dump());
@ -234,6 +249,14 @@ void MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock &MBB) {
// ...
// %xmm2<def> = copy %xmm9
ClobberRegister(Def);
for (const MachineOperand &MO : MI->implicit_operands()) {
if (!MO.isReg() || !MO.isDef())
continue;
unsigned Reg = MO.getReg();
if (!Reg)
continue;
ClobberRegister(Reg);
}
// Remember Def is defined by the copy.
for (MCSubRegIterator SR(Def, TRI, /*IncludeSelf=*/true); SR.isValid();
@ -268,17 +291,8 @@ void MachineCopyPropagation::CopyPropagateBlock(MachineBasicBlock &MBB) {
if (MO.isDef()) {
Defs.push_back(Reg);
continue;
}
// If 'Reg' is defined by a copy, the copy is no longer a candidate
// for elimination.
for (MCRegAliasIterator AI(Reg, TRI, true); AI.isValid(); ++AI) {
Reg2MIMap::iterator CI = CopyMap.find(*AI);
if (CI != CopyMap.end()) {
DEBUG(dbgs() << "MCP: Copy is used - not dead: "; CI->second->dump());
MaybeDeadCopies.remove(CI->second);
}
} else {
ReadRegister(Reg);
}
// Treat undef use like defs for copy propagation but not for
// dead copy. We would need to do a liveness check to be sure the copy

View File

@ -1556,9 +1556,10 @@ bool RegisterCoalescer::joinCopy(MachineInstr *CopyMI, bool &Again) {
bool RegisterCoalescer::joinReservedPhysReg(CoalescerPair &CP) {
unsigned DstReg = CP.getDstReg();
unsigned SrcReg = CP.getSrcReg();
assert(CP.isPhys() && "Must be a physreg copy");
assert(MRI->isReserved(DstReg) && "Not a reserved register");
LiveInterval &RHS = LIS->getInterval(CP.getSrcReg());
LiveInterval &RHS = LIS->getInterval(SrcReg);
DEBUG(dbgs() << "\t\tRHS = " << RHS << '\n');
assert(RHS.containsOneValue() && "Invalid join with reserved register");
@ -1592,17 +1593,36 @@ bool RegisterCoalescer::joinReservedPhysReg(CoalescerPair &CP) {
// Delete the identity copy.
MachineInstr *CopyMI;
if (CP.isFlipped()) {
CopyMI = MRI->getVRegDef(RHS.reg);
// Physreg is copied into vreg
// %vregY = COPY %X
// ... //< no other def of %X here
// use %vregY
// =>
// ...
// use %X
CopyMI = MRI->getVRegDef(SrcReg);
} else {
if (!MRI->hasOneNonDBGUse(RHS.reg)) {
// VReg is copied into physreg:
// %vregX = def
// ... //< no other def or use of %Y here
// %Y = COPY %vregX
// =>
// %Y = def
// ...
if (!MRI->hasOneNonDBGUse(SrcReg)) {
DEBUG(dbgs() << "\t\tMultiple vreg uses!\n");
return false;
}
MachineInstr *DestMI = MRI->getVRegDef(RHS.reg);
CopyMI = &*MRI->use_instr_nodbg_begin(RHS.reg);
const SlotIndex CopyRegIdx = LIS->getInstructionIndex(*CopyMI).getRegSlot();
const SlotIndex DestRegIdx = LIS->getInstructionIndex(*DestMI).getRegSlot();
if (!LIS->intervalIsInOneMBB(RHS)) {
DEBUG(dbgs() << "\t\tComplex control flow!\n");
return false;
}
MachineInstr &DestMI = *MRI->getVRegDef(SrcReg);
CopyMI = &*MRI->use_instr_nodbg_begin(SrcReg);
SlotIndex CopyRegIdx = LIS->getInstructionIndex(*CopyMI).getRegSlot();
SlotIndex DestRegIdx = LIS->getInstructionIndex(DestMI).getRegSlot();
if (!MRI->isConstantPhysReg(DstReg)) {
// We checked above that there are no interfering defs of the physical
@ -1629,8 +1649,8 @@ bool RegisterCoalescer::joinReservedPhysReg(CoalescerPair &CP) {
// We're going to remove the copy which defines a physical reserved
// register, so remove its valno, etc.
DEBUG(dbgs() << "\t\tRemoving phys reg def of " << DstReg << " at "
<< CopyRegIdx << "\n");
DEBUG(dbgs() << "\t\tRemoving phys reg def of " << PrintReg(DstReg, TRI)
<< " at " << CopyRegIdx << "\n");
LIS->removePhysRegDefAt(DstReg, CopyRegIdx);
// Create a new dead def at the new def location.

View File

@ -509,17 +509,17 @@ void CodeViewContext::encodeDefRange(MCAsmLayout &Layout,
// are artificially constructing.
size_t RecordSize = FixedSizePortion.size() +
sizeof(LocalVariableAddrRange) + 4 * NumGaps;
// Write out the recrod size.
support::endian::Writer<support::little>(OS).write<uint16_t>(RecordSize);
// Write out the record size.
LEWriter.write<uint16_t>(RecordSize);
// Write out the fixed size prefix.
OS << FixedSizePortion;
// Make space for a fixup that will eventually have a section relative
// relocation pointing at the offset where the variable becomes live.
Fixups.push_back(MCFixup::create(Contents.size(), BE, FK_SecRel_4));
Contents.resize(Contents.size() + 4); // Fixup for code start.
LEWriter.write<uint32_t>(0); // Fixup for code start.
// Make space for a fixup that will record the section index for the code.
Fixups.push_back(MCFixup::create(Contents.size(), BE, FK_SecRel_2));
Contents.resize(Contents.size() + 2); // Fixup for section index.
LEWriter.write<uint16_t>(0); // Fixup for section index.
// Write down the range's extent.
LEWriter.write<uint16_t>(Chunk);
@ -529,7 +529,7 @@ void CodeViewContext::encodeDefRange(MCAsmLayout &Layout,
} while (RangeSize > 0);
// Emit the gaps afterwards.
assert((NumGaps == 0 || Bias < MaxDefRange) &&
assert((NumGaps == 0 || Bias <= MaxDefRange) &&
"large ranges should not have gaps");
unsigned GapStartOffset = GapAndRangeSizes[I].second;
for (++I; I != J; ++I) {
@ -537,7 +537,7 @@ void CodeViewContext::encodeDefRange(MCAsmLayout &Layout,
assert(I < GapAndRangeSizes.size());
std::tie(GapSize, RangeSize) = GapAndRangeSizes[I];
LEWriter.write<uint16_t>(GapStartOffset);
LEWriter.write<uint16_t>(RangeSize);
LEWriter.write<uint16_t>(GapSize);
GapStartOffset += GapSize + RangeSize;
}
}

View File

@ -8934,8 +8934,9 @@ static SDValue splitStoreSplat(SelectionDAG &DAG, StoreSDNode &St,
// instructions (stp).
SDLoc DL(&St);
SDValue BasePtr = St.getBasePtr();
const MachinePointerInfo &PtrInfo = St.getPointerInfo();
SDValue NewST1 =
DAG.getStore(St.getChain(), DL, SplatVal, BasePtr, St.getPointerInfo(),
DAG.getStore(St.getChain(), DL, SplatVal, BasePtr, PtrInfo,
OrigAlignment, St.getMemOperand()->getFlags());
unsigned Offset = EltOffset;
@ -8944,7 +8945,7 @@ static SDValue splitStoreSplat(SelectionDAG &DAG, StoreSDNode &St,
SDValue OffsetPtr = DAG.getNode(ISD::ADD, DL, MVT::i64, BasePtr,
DAG.getConstant(Offset, DL, MVT::i64));
NewST1 = DAG.getStore(NewST1.getValue(0), DL, SplatVal, OffsetPtr,
St.getPointerInfo(), Alignment,
PtrInfo.getWithOffset(Offset), Alignment,
St.getMemOperand()->getFlags());
Offset += EltOffset;
}

View File

@ -77,11 +77,9 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
default:
return false;
case X86::TCRETURNdi:
case X86::TCRETURNdicc:
case X86::TCRETURNri:
case X86::TCRETURNmi:
case X86::TCRETURNdi64:
case X86::TCRETURNdi64cc:
case X86::TCRETURNri64:
case X86::TCRETURNmi64: {
bool isMem = Opcode == X86::TCRETURNmi || Opcode == X86::TCRETURNmi64;
@ -99,10 +97,6 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
Offset = StackAdj - MaxTCDelta;
assert(Offset >= 0 && "Offset should never be negative");
if (Opcode == X86::TCRETURNdicc || Opcode == X86::TCRETURNdi64cc) {
assert(Offset == 0 && "Conditional tail call cannot adjust the stack.");
}
if (Offset) {
// Check for possible merge with preceding ADD instruction.
Offset += X86FL->mergeSPUpdates(MBB, MBBI, true);
@ -111,21 +105,12 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
// Jump to label or value in register.
bool IsWin64 = STI->isTargetWin64();
if (Opcode == X86::TCRETURNdi || Opcode == X86::TCRETURNdicc ||
Opcode == X86::TCRETURNdi64 || Opcode == X86::TCRETURNdi64cc) {
if (Opcode == X86::TCRETURNdi || Opcode == X86::TCRETURNdi64) {
unsigned Op;
switch (Opcode) {
case X86::TCRETURNdi:
Op = X86::TAILJMPd;
break;
case X86::TCRETURNdicc:
Op = X86::TAILJMPd_CC;
break;
case X86::TCRETURNdi64cc:
assert(!IsWin64 && "Conditional tail calls confuse the Win64 unwinder.");
// TODO: We could do it for Win64 "leaf" functions though; PR30337.
Op = X86::TAILJMPd64_CC;
break;
default:
// Note: Win64 uses REX prefixes indirect jumps out of functions, but
// not direct ones.
@ -141,10 +126,6 @@ bool X86ExpandPseudo::ExpandMI(MachineBasicBlock &MBB,
MIB.addExternalSymbol(JumpTarget.getSymbolName(),
JumpTarget.getTargetFlags());
}
if (Op == X86::TAILJMPd_CC || Op == X86::TAILJMPd64_CC) {
MIB.addImm(MBBI->getOperand(2).getImm());
}
} else if (Opcode == X86::TCRETURNmi || Opcode == X86::TCRETURNmi64) {
unsigned Op = (Opcode == X86::TCRETURNmi)
? X86::TAILJMPm

View File

@ -264,21 +264,6 @@ let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1,
"jmp{l}\t{*}$dst", [], IIC_JMP_MEM>;
}
// Conditional tail calls are similar to the above, but they are branches
// rather than barriers, and they use EFLAGS.
let isCall = 1, isTerminator = 1, isReturn = 1, isBranch = 1,
isCodeGenOnly = 1, SchedRW = [WriteJumpLd] in
let Uses = [ESP, EFLAGS] in {
def TCRETURNdicc : PseudoI<(outs),
(ins i32imm_pcrel:$dst, i32imm:$offset, i32imm:$cond), []>;
// This gets substituted to a conditional jump instruction in MC lowering.
def TAILJMPd_CC : Ii32PCRel<0x80, RawFrm, (outs),
(ins i32imm_pcrel:$dst, i32imm:$cond),
"",
[], IIC_JMP_REL>;
}
//===----------------------------------------------------------------------===//
// Call Instructions...
@ -340,19 +325,3 @@ let isCall = 1, isTerminator = 1, isReturn = 1, isBarrier = 1,
"rex64 jmp{q}\t{*}$dst", [], IIC_JMP_MEM>;
}
}
// Conditional tail calls are similar to the above, but they are branches
// rather than barriers, and they use EFLAGS.
let isCall = 1, isTerminator = 1, isReturn = 1, isBranch = 1,
isCodeGenOnly = 1, SchedRW = [WriteJumpLd] in
let Uses = [RSP, EFLAGS] in {
def TCRETURNdi64cc : PseudoI<(outs),
(ins i64i32imm_pcrel:$dst, i32imm:$offset,
i32imm:$cond), []>;
// This gets substituted to a conditional jump instruction in MC lowering.
def TAILJMPd64_CC : Ii32PCRel<0x80, RawFrm, (outs),
(ins i64i32imm_pcrel:$dst, i32imm:$cond),
"",
[], IIC_JMP_REL>;
}

View File

@ -5108,85 +5108,6 @@ bool X86InstrInfo::isUnpredicatedTerminator(const MachineInstr &MI) const {
return !isPredicated(MI);
}
bool X86InstrInfo::isUnconditionalTailCall(const MachineInstr &MI) const {
switch (MI.getOpcode()) {
case X86::TCRETURNdi:
case X86::TCRETURNri:
case X86::TCRETURNmi:
case X86::TCRETURNdi64:
case X86::TCRETURNri64:
case X86::TCRETURNmi64:
return true;
default:
return false;
}
}
bool X86InstrInfo::canMakeTailCallConditional(
SmallVectorImpl<MachineOperand> &BranchCond,
const MachineInstr &TailCall) const {
if (TailCall.getOpcode() != X86::TCRETURNdi &&
TailCall.getOpcode() != X86::TCRETURNdi64) {
// Only direct calls can be done with a conditional branch.
return false;
}
if (Subtarget.isTargetWin64()) {
// Conditional tail calls confuse the Win64 unwinder.
// TODO: Allow them for "leaf" functions; PR30337.
return false;
}
assert(BranchCond.size() == 1);
if (BranchCond[0].getImm() > X86::LAST_VALID_COND) {
// Can't make a conditional tail call with this condition.
return false;
}
const X86MachineFunctionInfo *X86FI =
TailCall.getParent()->getParent()->getInfo<X86MachineFunctionInfo>();
if (X86FI->getTCReturnAddrDelta() != 0 ||
TailCall.getOperand(1).getImm() != 0) {
// A conditional tail call cannot do any stack adjustment.
return false;
}
return true;
}
void X86InstrInfo::replaceBranchWithTailCall(
MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &BranchCond,
const MachineInstr &TailCall) const {
assert(canMakeTailCallConditional(BranchCond, TailCall));
MachineBasicBlock::iterator I = MBB.end();
while (I != MBB.begin()) {
--I;
if (I->isDebugValue())
continue;
if (!I->isBranch())
assert(0 && "Can't find the branch to replace!");
X86::CondCode CC = getCondFromBranchOpc(I->getOpcode());
assert(BranchCond.size() == 1);
if (CC != BranchCond[0].getImm())
continue;
break;
}
unsigned Opc = TailCall.getOpcode() == X86::TCRETURNdi ? X86::TCRETURNdicc
: X86::TCRETURNdi64cc;
auto MIB = BuildMI(MBB, I, MBB.findDebugLoc(I), get(Opc));
MIB->addOperand(TailCall.getOperand(0)); // Destination.
MIB.addImm(0); // Stack offset (not used).
MIB->addOperand(BranchCond[0]); // Condition.
MIB.copyImplicitOps(TailCall); // Regmask and (imp-used) parameters.
I->eraseFromParent();
}
// Given a MBB and its TBB, find the FBB which was a fallthrough MBB (it may
// not be a fallthrough MBB now due to layout changes). Return nullptr if the
// fallthrough MBB cannot be identified.

View File

@ -316,13 +316,6 @@ public:
// Branch analysis.
bool isUnpredicatedTerminator(const MachineInstr &MI) const override;
bool isUnconditionalTailCall(const MachineInstr &MI) const override;
bool canMakeTailCallConditional(SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const override;
void replaceBranchWithTailCall(MachineBasicBlock &MBB,
SmallVectorImpl<MachineOperand> &Cond,
const MachineInstr &TailCall) const override;
bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,
MachineBasicBlock *&FBB,
SmallVectorImpl<MachineOperand> &Cond,

View File

@ -498,16 +498,11 @@ ReSimplify:
break;
}
// TAILJMPd, TAILJMPd64, TailJMPd_cc - Lower to the correct jump instruction.
// TAILJMPd, TAILJMPd64 - Lower to the correct jump instruction.
{ unsigned Opcode;
case X86::TAILJMPr: Opcode = X86::JMP32r; goto SetTailJmpOpcode;
case X86::TAILJMPd:
case X86::TAILJMPd64: Opcode = X86::JMP_1; goto SetTailJmpOpcode;
case X86::TAILJMPd_CC:
case X86::TAILJMPd64_CC:
Opcode = X86::GetCondBranchFromCond(
static_cast<X86::CondCode>(MI->getOperand(1).getImm()));
goto SetTailJmpOpcode;
SetTailJmpOpcode:
MCOperand Saved = OutMI.getOperand(0);
@ -1281,11 +1276,9 @@ void X86AsmPrinter::EmitInstruction(const MachineInstr *MI) {
case X86::TAILJMPr:
case X86::TAILJMPm:
case X86::TAILJMPd:
case X86::TAILJMPd_CC:
case X86::TAILJMPr64:
case X86::TAILJMPm64:
case X86::TAILJMPd64:
case X86::TAILJMPd64_CC:
case X86::TAILJMPr64_REX:
case X86::TAILJMPm64_REX:
// Lower these as normal, but add some comments.

View File

@ -7,12 +7,16 @@
; CHECK: @h = common global i32 0, align 4, !dbg ![[H:[0-9]+]]
; CHECK: ![[G]] = {{.*}}!DIGlobalVariableExpression(var: ![[GVAR:[0-9]+]], expr: ![[GEXPR:[0-9]+]])
; CHECK: ![[GVAR]] = distinct !DIGlobalVariable(name: "g",
; CHECK: DICompileUnit({{.*}}, imports: ![[IMPORTS:[0-9]+]]
; CHECK: !DIGlobalVariableExpression(var: ![[CVAR:[0-9]+]], expr: ![[CEXPR:[0-9]+]])
; CHECK: ![[CVAR]] = distinct !DIGlobalVariable(name: "c",
; CHECK: ![[CEXPR]] = !DIExpression(DW_OP_constu, 23, DW_OP_stack_value)
; CHECK: ![[H]] = {{.*}}!DIGlobalVariableExpression(var: ![[HVAR:[0-9]+]])
; CHECK: ![[HVAR]] = distinct !DIGlobalVariable(name: "h",
; CHECK: ![[HVAR:[0-9]+]] = distinct !DIGlobalVariable(name: "h",
; CHECK: ![[IMPORTS]] = !{![[CIMPORT:[0-9]+]]}
; CHECK: ![[CIMPORT]] = !DIImportedEntity({{.*}}entity: ![[HVAR]]
; CHECK: ![[GEXPR]] = !DIExpression(DW_OP_plus, 1)
; CHECK: ![[H]] = {{.*}}!DIGlobalVariableExpression(var: ![[HVAR]])
@g = common global i32 0, align 4, !dbg !0
@h = common global i32 0, align 4, !dbg !11
@ -21,9 +25,9 @@
!llvm.ident = !{!9}
!0 = distinct !DIGlobalVariable(name: "g", scope: !1, file: !2, line: 1, type: !5, isLocal: false, isDefinition: true, expr: !DIExpression(DW_OP_plus, 1))
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "clang version 4.0.0 (trunk 286129) (llvm/trunk 286128)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !3, globals: !4)
!1 = distinct !DICompileUnit(language: DW_LANG_C99, file: !2, producer: "clang version 4.0.0 (trunk 286129) (llvm/trunk 286128)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, imports: !3)
!2 = !DIFile(filename: "a.c", directory: "/")
!3 = !{}
!3 = !{!12}
!4 = !{!0, !10, !11}
!5 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!6 = !{i32 2, !"Dwarf Version", i32 4}
@ -32,3 +36,4 @@
!9 = !{!"clang version 4.0.0 (trunk 286129) (llvm/trunk 286128)"}
!10 = distinct !DIGlobalVariable(name: "c", scope: !1, file: !2, line: 1, type: !5, isLocal: false, isDefinition: true, expr: !DIExpression(DW_OP_constu, 23, DW_OP_stack_value))
!11 = distinct !DIGlobalVariable(name: "h", scope: !1, file: !2, line: 2, type: !5, isLocal: false, isDefinition: true)
!12 = !DIImportedEntity(tag: DW_TAG_imported_declaration, line: 1, scope: !1, entity: !11)

View File

@ -18,14 +18,13 @@
; CHECK-NEXT: !7 = !DILocalVariable(name: "V1", scope: !6, type: !2)
; CHECK-NEXT: !8 = !DIObjCProperty(name: "P1", type: !1)
; CHECK-NEXT: !9 = !DITemplateTypeParameter(type: !1)
; CHECK-NEXT: !10 = distinct !DIGlobalVariableExpression(var: !11)
; CHECK-NEXT: !11 = !DIGlobalVariable(name: "G",{{.*}} type: !1,
; CHECK-NEXT: !12 = !DITemplateValueParameter(type: !1, value: i32* @G1)
; CHECK-NEXT: !13 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "T2", scope: !0, entity: !1)
; CHECK-NEXT: !14 = !DICompositeType(tag: DW_TAG_structure_type, name: "T3", file: !0, elements: !15, identifier: "T3")
; CHECK-NEXT: !15 = !{!16}
; CHECK-NEXT: !16 = !DISubprogram(scope: !14,
; CHECK-NEXT: !17 = !DIDerivedType(tag: DW_TAG_ptr_to_member_type,{{.*}} extraData: !14)
; CHECK-NEXT: !10 = !DIGlobalVariable(name: "G",{{.*}} type: !1,
; CHECK-NEXT: !11 = !DITemplateValueParameter(type: !1, value: i32* @G1)
; CHECK-NEXT: !12 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "T2", scope: !0, entity: !1)
; CHECK-NEXT: !13 = !DICompositeType(tag: DW_TAG_structure_type, name: "T3", file: !0, elements: !14, identifier: "T3")
; CHECK-NEXT: !14 = !{!15}
; CHECK-NEXT: !15 = !DISubprogram(scope: !13,
; CHECK-NEXT: !16 = !DIDerivedType(tag: DW_TAG_ptr_to_member_type,{{.*}} extraData: !13)
!0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir")
!1 = !DICompositeType(tag: DW_TAG_structure_type, name: "T1", file: !0, identifier: "T1")

View File

@ -0,0 +1,74 @@
; RUN: llc -mtriple=aarch64 -mcpu=cortex-a53 < %s | FileCheck %s
; Tests to check that zero stores which are generated as STP xzr, xzr aren't
; scheduled incorrectly due to incorrect alias information
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1)
%struct.tree_common = type { i8*, i8*, i32 }
; Original test case which exhibited the bug
define void @test1(%struct.tree_common* %t, i32 %code, i8* %type) {
; CHECK-LABEL: test1:
; CHECK: stp xzr, xzr, [x0, #8]
; CHECK: stp xzr, x2, [x0]
; CHECK: str w1, [x0, #16]
entry:
%0 = bitcast %struct.tree_common* %t to i8*
tail call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 24, i32 8, i1 false)
%code1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i32 %code, i32* %code1, align 8
%type2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
store i8* %type, i8** %type2, align 8
ret void
}
; Store to each struct element instead of using memset
define void @test2(%struct.tree_common* %t, i32 %code, i8* %type) {
; CHECK-LABEL: test2:
; CHECK: stp xzr, xzr, [x0]
; CHECK: str wzr, [x0, #16]
; CHECK: str w1, [x0, #16]
; CHECK: str x2, [x0, #8]
entry:
%0 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 0
%1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
%2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i8* zeroinitializer, i8** %0, align 8
store i8* zeroinitializer, i8** %1, align 8
store i32 zeroinitializer, i32* %2, align 8
store i32 %code, i32* %2, align 8
store i8* %type, i8** %1, align 8
ret void
}
; Vector store instead of memset
define void @test3(%struct.tree_common* %t, i32 %code, i8* %type) {
; CHECK-LABEL: test3:
; CHECK: stp xzr, xzr, [x0, #8]
; CHECK: stp xzr, x2, [x0]
; CHECK: str w1, [x0, #16]
entry:
%0 = bitcast %struct.tree_common* %t to <3 x i64>*
store <3 x i64> zeroinitializer, <3 x i64>* %0, align 8
%code1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i32 %code, i32* %code1, align 8
%type2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
store i8* %type, i8** %type2, align 8
ret void
}
; Vector store, then store to vector elements
define void @test4(<3 x i64>* %p, i64 %x, i64 %y) {
; CHECK-LABEL: test4:
; CHECK: stp xzr, xzr, [x0, #8]
; CHECK: stp xzr, x2, [x0]
; CHECK: str x1, [x0, #16]
entry:
store <3 x i64> zeroinitializer, <3 x i64>* %p, align 8
%0 = bitcast <3 x i64>* %p to i64*
%1 = getelementptr inbounds i64, i64* %0, i64 2
store i64 %x, i64* %1, align 8
%2 = getelementptr inbounds i64, i64* %0, i64 1
store i64 %y, i64* %2, align 8
ret void
}

View File

@ -0,0 +1,57 @@
; REQUIRES: asserts
; RUN: llc < %s -mtriple=aarch64 -mcpu=cyclone -mattr=+use-aa -enable-misched -verify-misched -debug-only=misched -o - 2>&1 > /dev/null | FileCheck %s
; Tests to check that the scheduler dependencies derived from alias analysis are
; correct when we have loads that have been split up so that they can later be
; merged into STP.
; CHECK: ********** MI Scheduling **********
; CHECK: test_splat:BB#0 entry
; CHECK: SU({{[0-9]+}}): STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 3; mem:ST4[%3+8]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU1:SU\([0-9]+\)]]
; CHECK: SU({{[0-9]+}}): STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 2; mem:ST4[%3+4]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU2:SU\([0-9]+\)]]
; CHECK: [[SU1]]: STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 3; mem:ST4[%2]
; CHECK: [[SU2]]: STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 2; mem:ST4[%1]
define void @test_splat(i32 %x, i32 %y, i32* %p) {
entry:
%val = load i32, i32* %p, align 4
%0 = getelementptr inbounds i32, i32* %p, i64 1
%1 = getelementptr inbounds i32, i32* %p, i64 2
%2 = getelementptr inbounds i32, i32* %p, i64 3
%vec0 = insertelement <4 x i32> undef, i32 %val, i32 0
%vec1 = insertelement <4 x i32> %vec0, i32 %val, i32 1
%vec2 = insertelement <4 x i32> %vec1, i32 %val, i32 2
%vec3 = insertelement <4 x i32> %vec2, i32 %val, i32 3
%3 = bitcast i32* %0 to <4 x i32>*
store <4 x i32> %vec3, <4 x i32>* %3, align 4
store i32 %x, i32* %2, align 4
store i32 %y, i32* %1, align 4
ret void
}
declare void @llvm.memset.p0i8.i64(i8* nocapture, i8, i64, i32, i1)
%struct.tree_common = type { i8*, i8*, i32 }
; CHECK: ********** MI Scheduling **********
; CHECK: test_zero:BB#0 entry
; CHECK: SU({{[0-9]+}}): STRXui %XZR, %vreg{{[0-9]+}}, 2; mem:ST8[%0+16]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU3:SU\([0-9]+\)]]
; CHECK: SU({{[0-9]+}}): STRXui %XZR, %vreg{{[0-9]+}}, 1; mem:ST8[%0+8]
; CHECK: Successors:
; CHECK-NEXT: ord [[SU4:SU\([0-9]+\)]]
; CHECK: [[SU3]]: STRWui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 4; mem:ST4[%code1]
; CHECK: [[SU4]]: STRXui %vreg{{[0-9]+}}, %vreg{{[0-9]+}}, 1; mem:ST8[%type2]
define void @test_zero(%struct.tree_common* %t, i32 %code, i8* %type) {
entry:
%0 = bitcast %struct.tree_common* %t to i8*
tail call void @llvm.memset.p0i8.i64(i8* %0, i8 0, i64 24, i32 8, i1 false)
%code1 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 2
store i32 %code, i32* %code1, align 8
%type2 = getelementptr inbounds %struct.tree_common, %struct.tree_common* %t, i64 0, i32 1
store i8* %type, i8** %type2, align 8
ret void
}

View File

@ -1,19 +1,24 @@
# RUN: llc -mtriple=aarch64-apple-ios -run-pass=simple-register-coalescing %s -o - | FileCheck %s
--- |
define void @func() { ret void }
define void @func0() { ret void }
define void @func1() { ret void }
define void @func2() { ret void }
...
---
# Check coalescing of COPYs from reserved physregs.
# CHECK-LABEL: name: func
name: func
# CHECK-LABEL: name: func0
name: func0
registers:
- { id: 0, class: gpr32 }
- { id: 1, class: gpr64 }
- { id: 2, class: gpr64 }
- { id: 3, class: gpr32 }
- { id: 4, class: gpr64 }
- { id: 5, class: gpr32 }
- { id: 6, class: xseqpairsclass }
- { id: 0, class: gpr32 }
- { id: 1, class: gpr64 }
- { id: 2, class: gpr64 }
- { id: 3, class: gpr32 }
- { id: 4, class: gpr64 }
- { id: 5, class: gpr32 }
- { id: 6, class: xseqpairsclass }
- { id: 7, class: gpr64 }
- { id: 8, class: gpr64sp }
- { id: 9, class: gpr64sp }
body: |
bb.0:
; We usually should not coalesce copies from allocatable physregs.
@ -60,8 +65,74 @@ body: |
; Only coalesce when the source register is reserved as a whole (this is
; a limitation of the current code which cannot update liveness information
; of the non-reserved part).
; CHECK: %6 = COPY %xzr_x0
; CHECK: %6 = COPY %x28_fp
; CHECK: HINT 0, implicit %6
%6 = COPY %xzr_x0
%6 = COPY %x28_fp
HINT 0, implicit %6
; This can be coalesced.
; CHECK: %fp = SUBXri %fp, 4, 0
%8 = SUBXri %fp, 4, 0
%fp = COPY %8
; Cannot coalesce when there are reads of the physreg.
; CHECK-NOT: %fp = SUBXri %fp, 8, 0
; CHECK: %9 = SUBXri %fp, 8, 0
; CHECK: STRXui %fp, %fp, 0
; CHECK: %fp = COPY %9
%9 = SUBXri %fp, 8, 0
STRXui %fp, %fp, 0
%fp = COPY %9
...
---
# Check coalescing of COPYs from reserved physregs.
# CHECK-LABEL: name: func1
name: func1
registers:
- { id: 0, class: gpr64sp }
body: |
bb.0:
successors: %bb.1, %bb.2
; Cannot coalesce physreg because we have reads on other CFG paths (we
; currently abort for any control flow)
; CHECK-NOT: %fp = SUBXri
; CHECK: %0 = SUBXri %fp, 12, 0
; CHECK: CBZX undef %x0, %bb.1
; CHECK: B %bb.2
%0 = SUBXri %fp, 12, 0
CBZX undef %x0, %bb.1
B %bb.2
bb.1:
%fp = COPY %0
RET_ReallyLR
bb.2:
STRXui %fp, %fp, 0
RET_ReallyLR
...
---
# CHECK-LABEL: name: func2
name: func2
registers:
- { id: 0, class: gpr64sp }
body: |
bb.0:
successors: %bb.1, %bb.2
; We can coalesce copies from physreg to vreg across multiple blocks.
; CHECK-NOT: COPY
; CHECK: CBZX undef %x0, %bb.1
; CHECK-NEXT: B %bb.2
%0 = COPY %fp
CBZX undef %x0, %bb.1
B %bb.2
bb.1:
; CHECK: STRXui undef %x0, %fp, 0
; CHECK-NEXT: RET_ReallyLR
STRXui undef %x0, %0, 0
RET_ReallyLR
bb.2:
RET_ReallyLR
...

View File

@ -0,0 +1,22 @@
# RUN: llc -o - %s -mtriple=armv7s-- -run-pass=machine-cp | FileCheck %s
---
# Test that machine copy prop recognizes the implicit-def operands on a COPY
# as clobbering the register.
# CHECK-LABEL: name: func
# CHECK: %d2 = VMOVv2i32 2, 14, _
# CHECK: %s5 = COPY %s0, implicit %q1, implicit-def %q1
# CHECK: VST1q32 %r0, 0, %q1, 14, _
# The following two COPYs must not be removed
# CHECK: %s4 = COPY %s20, implicit-def %q1
# CHECK: %s5 = COPY %s0, implicit killed %d0, implicit %q1, implicit-def %q1
# CHECK: VST1q32 %r2, 0, %q1, 14, _
name: func
body: |
bb.0:
%d2 = VMOVv2i32 2, 14, _
%s5 = COPY %s0, implicit %q1, implicit-def %q1
VST1q32 %r0, 0, %q1, 14, _
%s4 = COPY %s20, implicit-def %q1
%s5 = COPY %s0, implicit killed %d0, implicit %q1, implicit-def %q1
VST1q32 %r2, 0, %q1, 14, _
...

View File

@ -1,53 +0,0 @@
; RUN: llc < %s -mtriple=i686-linux -show-mc-encoding | FileCheck %s
; RUN: llc < %s -mtriple=x86_64-linux -show-mc-encoding | FileCheck %s
declare void @foo()
declare void @bar()
define void @f(i32 %x, i32 %y) optsize {
entry:
%p = icmp eq i32 %x, %y
br i1 %p, label %bb1, label %bb2
bb1:
tail call void @foo()
ret void
bb2:
tail call void @bar()
ret void
; CHECK-LABEL: f:
; CHECK: cmp
; CHECK: jne bar
; Check that the asm doesn't just look good, but uses the correct encoding.
; CHECK: encoding: [0x75,A]
; CHECK: jmp foo
}
declare x86_thiscallcc zeroext i1 @baz(i8*, i32)
define x86_thiscallcc zeroext i1 @BlockPlacementTest(i8* %this, i32 %x) optsize {
entry:
%and = and i32 %x, 42
%tobool = icmp eq i32 %and, 0
br i1 %tobool, label %land.end, label %land.rhs
land.rhs:
%and6 = and i32 %x, 44
%tobool7 = icmp eq i32 %and6, 0
br i1 %tobool7, label %lor.rhs, label %land.end
lor.rhs:
%call = tail call x86_thiscallcc zeroext i1 @baz(i8* %this, i32 %x) #2
br label %land.end
land.end:
%0 = phi i1 [ false, %entry ], [ true, %land.rhs ], [ %call, %lor.rhs ]
ret i1 %0
; Make sure machine block placement isn't confused by the conditional tail call,
; but sees that it can fall through to the next block.
; CHECK-LABEL: BlockPlacementTest
; CHECK: je baz
; CHECK-NOT: xor
; CHECK: ret
}

View File

@ -93,7 +93,8 @@ if.end:
; CHECK-LABEL: test2_1:
; CHECK: movzbl
; CHECK: cmpl $256
; CHECK: je bar
; CHECK: jne .LBB
; CHECK: jmp bar
define void @test2_1(i32 %X) nounwind minsize {
entry:
%and = and i32 %X, 255
@ -223,7 +224,8 @@ if.end:
; CHECK-LABEL: test_sext_i8_icmp_255:
; CHECK: movb $1,
; CHECK: testb
; CHECK: je bar
; CHECK: jne .LBB
; CHECK: jmp bar
define void @test_sext_i8_icmp_255(i8 %x) nounwind minsize {
entry:
%sext = sext i8 %x to i32

View File

@ -1,85 +0,0 @@
# RUN: llc -mtriple x86_64-- -verify-machineinstrs -run-pass branch-folder -o - %s | FileCheck %s
# Check the TCRETURNdi64cc optimization.
--- |
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define i64 @test(i64 %arg, i8* %arg1) optsize {
%tmp = icmp ult i64 %arg, 100
br i1 %tmp, label %1, label %4
%tmp3 = icmp ult i64 %arg, 10
br i1 %tmp3, label %2, label %3
%tmp5 = tail call i64 @f1(i8* %arg1, i64 %arg)
ret i64 %tmp5
%tmp7 = tail call i64 @f2(i8* %arg1, i64 %arg)
ret i64 %tmp7
ret i64 123
}
declare i64 @f1(i8*, i64)
declare i64 @f2(i8*, i64)
...
---
name: test
tracksRegLiveness: true
liveins:
- { reg: '%rdi' }
- { reg: '%rsi' }
body: |
bb.0:
successors: %bb.1, %bb.4
liveins: %rdi, %rsi
%rax = COPY %rdi
CMP64ri8 %rax, 99, implicit-def %eflags
JA_1 %bb.4, implicit %eflags
JMP_1 %bb.1
; CHECK: bb.1:
; CHECK-NEXT: successors: %bb.2({{[^)]+}}){{$}}
; CHECK-NEXT: liveins: %rax, %rsi
; CHECK-NEXT: {{^ $}}
; CHECK-NEXT: %rdi = COPY %rsi
; CHECK-NEXT: %rsi = COPY %rax
; CHECK-NEXT: CMP64ri8 %rax, 9, implicit-def %eflags
; CHECK-NEXT: TCRETURNdi64cc @f1, 0, 3, csr_64, implicit %rsp, implicit %eflags, implicit %rsp, implicit %rdi, implicit %rsi
bb.1:
successors: %bb.2, %bb.3
liveins: %rax, %rsi
CMP64ri8 %rax, 9, implicit-def %eflags
JA_1 %bb.3, implicit %eflags
JMP_1 %bb.2
bb.2:
liveins: %rax, %rsi
%rdi = COPY %rsi
%rsi = COPY %rax
TCRETURNdi64 @f1, 0, csr_64, implicit %rsp, implicit %rdi, implicit %rsi
; CHECK: bb.2:
; CHECK-NEXT: liveins: %rax, %rdi, %rsi
; CHECK-NEXT: {{^ $}}
; CHECK-NEXT: TCRETURNdi64 @f2, 0, csr_64, implicit %rsp, implicit %rdi, implicit %rsi
bb.3:
liveins: %rax, %rsi
%rdi = COPY %rsi
%rsi = COPY %rax
TCRETURNdi64 @f2, 0, csr_64, implicit %rsp, implicit %rdi, implicit %rsi
bb.4:
dead %eax = MOV32ri64 123, implicit-def %rax
RET 0, %rax
...

View File

@ -37,6 +37,19 @@
# CHECK-NEXT: ISectStart: 0x0
# CHECK-NEXT: Range: 0x1
# CHECK-NEXT: }
# CHECK-NEXT: }
# CHECK-NEXT: DefRangeRegister {
# CHECK-NEXT: Register: 23
# CHECK-NEXT: MayHaveNoName: 0
# CHECK-NEXT: LocalVariableAddrRange {
# CHECK-NEXT: OffsetStart: .text+0x2001C
# CHECK-NEXT: ISectStart: 0x0
# CHECK-NEXT: Range: 0xF000
# CHECK-NEXT: }
# CHECK-NEXT: LocalVariableAddrGap [
# CHECK-NEXT: GapStartOffset: 0x1
# CHECK-NEXT: Range: 0xEFFE
# CHECK-NEXT: ]
# CHECK-NEXT: }
.text
@ -62,6 +75,16 @@ f: # @f
.Lbegin3:
nop
.Lend3:
# Create a range that is exactly 0xF000 bytes long with a gap in the
# middle.
.Lbegin4:
nop
.Lend4:
.fill 0xeffe, 1, 0x90
.Lbegin5:
nop
.Lend5:
ret
.Lfunc_end0:
@ -94,6 +117,7 @@ f: # @f
.asciz "p"
.Ltmp19:
.cv_def_range .Lbegin0 .Lend0 .Lbegin1 .Lend1 .Lbegin2 .Lend2 .Lbegin3 .Lend3, "A\021\027\000\000\000"
.cv_def_range .Lbegin4 .Lend4 .Lbegin5 .Lend5, "A\021\027\000\000\000"
.short 2 # Record length
.short 4431 # Record kind: S_PROC_ID_END
.Ltmp15:

View File

@ -131,4 +131,44 @@ TEST(IListIteratorTest, CheckEraseReverse) {
EXPECT_EQ(L.rend(), RI);
}
TEST(IListIteratorTest, ReverseConstructor) {
simple_ilist<Node> L;
const simple_ilist<Node> &CL = L;
Node A, B;
L.insert(L.end(), A);
L.insert(L.end(), B);
// Save typing.
typedef simple_ilist<Node>::iterator iterator;
typedef simple_ilist<Node>::reverse_iterator reverse_iterator;
typedef simple_ilist<Node>::const_iterator const_iterator;
typedef simple_ilist<Node>::const_reverse_iterator const_reverse_iterator;
// Check conversion values.
EXPECT_EQ(L.begin(), iterator(L.rend()));
EXPECT_EQ(++L.begin(), iterator(++L.rbegin()));
EXPECT_EQ(L.end(), iterator(L.rbegin()));
EXPECT_EQ(L.rbegin(), reverse_iterator(L.end()));
EXPECT_EQ(++L.rbegin(), reverse_iterator(++L.begin()));
EXPECT_EQ(L.rend(), reverse_iterator(L.begin()));
// Check const iterator constructors.
EXPECT_EQ(CL.begin(), const_iterator(L.rend()));
EXPECT_EQ(CL.begin(), const_iterator(CL.rend()));
EXPECT_EQ(CL.rbegin(), const_reverse_iterator(L.end()));
EXPECT_EQ(CL.rbegin(), const_reverse_iterator(CL.end()));
// Confirm lack of implicit conversions.
static_assert(!std::is_convertible<iterator, reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(!std::is_convertible<reverse_iterator, iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_iterator, const_reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_reverse_iterator, const_iterator>::value,
"unexpected implicit conversion");
}
} // end namespace

View File

@ -130,4 +130,68 @@ TEST(MachineInstrBundleIteratorTest, CompareToBundledMI) {
ASSERT_TRUE(CI != CMBI.getIterator());
}
struct MyUnbundledInstr
: ilist_node<MyUnbundledInstr, ilist_sentinel_tracking<true>> {
bool isBundledWithPred() const { return false; }
bool isBundledWithSucc() const { return false; }
};
typedef MachineInstrBundleIterator<MyUnbundledInstr> unbundled_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr>
const_unbundled_iterator;
typedef MachineInstrBundleIterator<MyUnbundledInstr, true>
reverse_unbundled_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr, true>
const_reverse_unbundled_iterator;
TEST(MachineInstrBundleIteratorTest, ReverseConstructor) {
simple_ilist<MyUnbundledInstr, ilist_sentinel_tracking<true>> L;
const auto &CL = L;
MyUnbundledInstr A, B;
L.insert(L.end(), A);
L.insert(L.end(), B);
// Save typing.
typedef MachineInstrBundleIterator<MyUnbundledInstr> iterator;
typedef MachineInstrBundleIterator<MyUnbundledInstr, true> reverse_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr> const_iterator;
typedef MachineInstrBundleIterator<const MyUnbundledInstr, true>
const_reverse_iterator;
// Convert to bundle iterators.
auto begin = [&]() -> iterator { return L.begin(); };
auto end = [&]() -> iterator { return L.end(); };
auto rbegin = [&]() -> reverse_iterator { return L.rbegin(); };
auto rend = [&]() -> reverse_iterator { return L.rend(); };
auto cbegin = [&]() -> const_iterator { return CL.begin(); };
auto cend = [&]() -> const_iterator { return CL.end(); };
auto crbegin = [&]() -> const_reverse_iterator { return CL.rbegin(); };
auto crend = [&]() -> const_reverse_iterator { return CL.rend(); };
// Check conversion values.
EXPECT_EQ(begin(), iterator(rend()));
EXPECT_EQ(++begin(), iterator(++rbegin()));
EXPECT_EQ(end(), iterator(rbegin()));
EXPECT_EQ(rbegin(), reverse_iterator(end()));
EXPECT_EQ(++rbegin(), reverse_iterator(++begin()));
EXPECT_EQ(rend(), reverse_iterator(begin()));
// Check const iterator constructors.
EXPECT_EQ(cbegin(), const_iterator(rend()));
EXPECT_EQ(cbegin(), const_iterator(crend()));
EXPECT_EQ(crbegin(), const_reverse_iterator(end()));
EXPECT_EQ(crbegin(), const_reverse_iterator(cend()));
// Confirm lack of implicit conversions.
static_assert(!std::is_convertible<iterator, reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(!std::is_convertible<reverse_iterator, iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_iterator, const_reverse_iterator>::value,
"unexpected implicit conversion");
static_assert(
!std::is_convertible<const_reverse_iterator, const_iterator>::value,
"unexpected implicit conversion");
}
} // end namespace

View File

@ -47,7 +47,6 @@ svn.exe export -r %revision% http://llvm.org/svn/llvm-project/clang-tools-extra/
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/lld/%branch% llvm/tools/lld || exit /b
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/compiler-rt/%branch% llvm/projects/compiler-rt || exit /b
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/openmp/%branch% llvm/projects/openmp || exit /b
svn.exe export -r %revision% http://llvm.org/svn/llvm-project/lldb/%branch% llvm/tools/lldb || exit /b
REM Setting CMAKE_CL_SHOWINCLUDES_PREFIX to work around PR27226.