Vendor import of llvm trunk r304222:

https://llvm.org/svn/llvm-project/llvm/trunk@304222
Dimitry Andric 2017-05-30 17:37:31 +00:00
parent ab44ce3d59
commit ee2f195dd3
166 changed files with 16655 additions and 1441 deletions
docs
include/llvm
lib
test/CodeGen

@ -0,0 +1,182 @@
==================
Vectorization Plan
==================
.. contents::
:local:
Abstract
========
The vectorization transformation can be rather complicated, involving several
potential alternatives, especially for outer-loops [1]_ but also possibly for
innermost loops. These alternatives may have significant performance impact,
both positive and negative. A cost model is therefore employed to identify the
best alternative, including the alternative of avoiding any transformation
altogether.
The Vectorization Plan is an explicit model for describing vectorization
candidates. It serves both to optimize candidates, including estimating their
cost reliably, and to perform their final translation into IR. This
facilitates dealing with multiple vectorization candidates.
High-level Design
=================
Vectorization Workflow
----------------------
VPlan-based vectorization involves three major steps, taking a "scenario-based
approach" to vectorization planning:
1. Legal Step: check if a loop can be legally vectorized; encode constraints and
artifacts if so.
2. Plan Step:
a. Build initial VPlans following the constraints and decisions taken by
Legal Step 1, and compute their cost.
b. Apply optimizations to the VPlans, possibly forking additional VPlans.
Prune sub-optimal VPlans having relatively high cost.
3. Execute Step: materialize the best VPlan. Note that this is the only step
that modifies the IR.
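The three steps above can be sketched as follows. This is only an illustration of the control flow, not the planner's actual interface: ``Legality``, ``CandidatePlan``, ``buildPlans``, ``selectBest`` and ``shouldExecute`` are hypothetical names, and the cost formula is a made-up placeholder rather than LLVM's cost model.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result of the Legal step.
struct Legality {
  bool CanVectorize;
  unsigned MaxVF;
};

// Hypothetical stand-in for one VPlan: a candidate VF and its estimated cost.
struct CandidatePlan {
  unsigned VF;
  double Cost;
};

// Plan step (a): build one candidate per power-of-two VF allowed by the Legal
// step. VF = 1 models the alternative of not vectorizing at all.
// The cost formula below is a placeholder assumption.
std::vector<CandidatePlan> buildPlans(const Legality &Legal, double ScalarCost,
                                      double WideningOverhead) {
  std::vector<CandidatePlan> Plans;
  Plans.push_back({1, ScalarCost});
  if (!Legal.CanVectorize)
    return Plans;
  for (unsigned VF = 2; VF <= Legal.MaxVF; VF *= 2)
    Plans.push_back({VF, ScalarCost / VF + WideningOverhead});
  return Plans;
}

// Plan step (b): prune everything except the cheapest candidate.
CandidatePlan selectBest(const std::vector<CandidatePlan> &Plans) {
  assert(!Plans.empty() && "the scalar candidate is always present");
  CandidatePlan Best = Plans.front();
  for (const CandidatePlan &P : Plans)
    if (P.Cost < Best.Cost)
      Best = P;
  return Best;
}

// Execute step: the only point at which IR would be modified. If the scalar
// candidate wins, nothing is executed and the input is left untouched.
bool shouldExecute(const CandidatePlan &Best) { return Best.VF > 1; }
```

Keeping the scalar option in the candidate list means "do not vectorize" competes on cost like any other alternative.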
Design Guidelines
-----------------
In what follows, the term "input IR" refers to code that is fed into the
vectorizer whereas the term "output IR" refers to code that is generated by the
vectorizer. The output IR contains code that has been vectorized or "widened"
according to a loop Vectorization Factor (VF), and/or loop unroll-and-jammed
according to an Unroll Factor (UF).
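As a rough illustration of what VF and UF mean for the shape of the output code, the sketch below emulates a widened loop in plain C++. The function names are hypothetical, and the innermost ``Lane`` loop merely stands in for a single vector instruction; real output IR would use vector types instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Input loop: one element per iteration.
void addScalar(const std::vector<int> &B, const std::vector<int> &C,
               std::vector<int> &A) {
  assert(B.size() >= A.size() && C.size() >= A.size());
  for (std::size_t I = 0; I < A.size(); ++I)
    A[I] = B[I] + C[I];
}

// Output loop shape: each iteration covers VF * UF elements, followed by a
// scalar remainder loop for the leftover iterations.
template <std::size_t VF, std::size_t UF>
void addWidened(const std::vector<int> &B, const std::vector<int> &C,
                std::vector<int> &A) {
  assert(B.size() >= A.size() && C.size() >= A.size());
  const std::size_t N = A.size();
  const std::size_t Step = VF * UF;
  std::size_t I = 0;
  for (; I + Step <= N; I += Step)
    for (std::size_t Part = 0; Part < UF; ++Part)     // UF unrolled copies
      for (std::size_t Lane = 0; Lane < VF; ++Lane) { // one VF-wide operation
        const std::size_t Idx = I + Part * VF + Lane;
        A[Idx] = B[Idx] + C[Idx];
      }
  for (; I < N; ++I) // scalar remainder
    A[I] = B[I] + C[I];
}
```

For example, ``addWidened<4, 2>`` advances eight elements per main-loop iteration and yields the same result as ``addScalar``.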
The design of VPlan follows several high-level guidelines:
1. Analysis-like: building and manipulating VPlans must not modify the input IR.
In particular, if the best option is not to vectorize at all, the
vectorization process terminates before reaching Step 3, and compilation
should proceed as if VPlans had not been built.
2. Align Cost & Execute: each VPlan must support both estimating the cost and
generating the output IR code, such that the cost estimation evaluates the
to-be-generated code reliably.
3. Support vectorizing additional constructs:
a. Outer-loop vectorization. In particular, VPlan must be able to model the
control-flow of the output IR which may include multiple basic-blocks and
nested loops.
b. SLP vectorization.
c. Combinations of the above, including nested vectorization: vectorizing
both an inner loop and an outer-loop at the same time (each with its own
VF and UF), mixed vectorization: vectorizing a loop with SLP patterns
inside [4]_, (re)vectorizing input IR containing vector code.
d. Function vectorization [2]_.
4. Support multiple candidates efficiently. In particular, similar candidates
related to a range of possible VF's and UF's must be represented efficiently.
Potential versioning needs to be supported efficiently.
5. Support vectorizing idioms, such as interleaved groups of strided loads or
stores. This is achieved by modeling a sequence of output instructions using
a "Recipe", which is responsible for computing its cost and generating its
code.
6. Encapsulate Single-Entry Single-Exit regions (SESE). During vectorization
such regions may need to be, for example, predicated and linearized, or
replicated VF*UF times to handle scalarized and predicated instructions.
Inner loops are also modelled as SESE regions.
Low-level Design
================
The low-level design of VPlan comprises the following classes.
:LoopVectorizationPlanner:
A LoopVectorizationPlanner is designed to handle the vectorization of a loop
or a loop nest. It can construct, optimize and discard one or more VPlans,
each VPlan modelling a distinct way to vectorize the loop or the loop nest.
Once the best VPlan is determined, including the best VF and UF, this VPlan
drives the generation of output IR.
:VPlan:
A model of a vectorized candidate for a given input IR loop or loop nest. This
candidate is represented using a Hierarchical CFG. VPlan supports estimating
the cost and driving the generation of the output IR code it represents.
:Hierarchical CFG:
A control-flow graph whose nodes are basic-blocks or Hierarchical CFG's. The
Hierarchical CFG data structure is similar to the Tile Tree [5]_, where
cross-Tile edges are lifted to connect Tiles instead of the original
basic-blocks as in Sharir [6]_, promoting the Tile encapsulation. The terms
Region and Block are used rather than Tile [5]_ to avoid confusion with loop
tiling.
:VPBlockBase:
The building block of the Hierarchical CFG. A pure-virtual base-class of
VPBasicBlock and VPRegionBlock, see below. VPBlockBase models the hierarchical
control-flow relations with other VPBlocks. Note that in contrast to the IR
BasicBlock, a VPBlockBase models its control-flow successors and predecessors
directly, rather than through a Terminator branch or through predecessor
branches that "use" the VPBlockBase.
:VPBasicBlock:
VPBasicBlock is a subclass of VPBlockBase, and serves as the leaves of the
Hierarchical CFG. It represents a sequence of output IR instructions that will
appear consecutively in an output IR basic-block. The instructions of this
basic-block originate from one or more VPBasicBlocks. VPBasicBlock holds a
sequence of zero or more VPRecipes that model the cost and generation of the
output IR instructions.
:VPRegionBlock:
VPRegionBlock is a subclass of VPBlockBase. It models a collection of
VPBasicBlocks and VPRegionBlocks which form a SESE subgraph of the output IR
CFG. A VPRegionBlock may indicate that its contents are to be replicated a
constant number of times when output IR is generated, effectively representing
a loop with constant trip-count that will be completely unrolled. This is used
to support scalarized and predicated instructions with a single model for
multiple candidate VF's and UF's.
:VPRecipeBase:
A pure-virtual base class modeling a sequence of one or more output IR
instructions, possibly based on one or more input IR instructions. These
input IR instructions are referred to as "Ingredients" of the Recipe. A Recipe
may specify how its ingredients are to be transformed to produce the output IR
instructions; e.g., cloned once, replicated multiple times or widened
according to selected VF.
:VPTransformState:
Stores information used for generating output IR, passed from
LoopVectorizationPlanner to its selected VPlan for execution, and used to pass
additional information down to VPBlocks and VPRecipes.
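The relationships between the classes listed above can be sketched as below. This is a simplified assumption-laden model, not the proposed implementation: the class names drop the ``VP`` prefix to make clear they are hypothetical, the output IR is represented as a list of strings, and a region simply stores its blocks in execution order.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for VPTransformState.
struct TransformState {
  unsigned VF;
  unsigned UF;
  std::vector<std::string> Output; // placeholder for generated instructions
};

// Stand-in for VPRecipeBase: knows its cost and how to generate its code.
class RecipeBase {
public:
  virtual ~RecipeBase() = default;
  virtual unsigned cost(unsigned VF) const = 0;
  virtual void execute(TransformState &State) const = 0;
};

// A recipe that widens a single ingredient according to the selected VF.
class WidenRecipe : public RecipeBase {
  std::string Ingredient;

public:
  explicit WidenRecipe(std::string I) : Ingredient(std::move(I)) {}
  unsigned cost(unsigned) const override { return 1; } // placeholder cost
  void execute(TransformState &State) const override {
    State.Output.push_back("wide." + Ingredient + " x" +
                           std::to_string(State.VF));
  }
};

// Stand-in for VPBlockBase: models successors and predecessors directly.
class BlockBase {
public:
  virtual ~BlockBase() = default;
  virtual void execute(TransformState &State) const = 0;
  std::vector<BlockBase *> Successors, Predecessors;
};

// Leaf of the hierarchical CFG: a sequence of recipes.
class BasicBlock : public BlockBase {
public:
  std::vector<std::unique_ptr<RecipeBase>> Recipes;
  void execute(TransformState &State) const override {
    for (const auto &R : Recipes)
      R->execute(State);
  }
};

// SESE region; a replicating region emits its contents VF * UF times.
class RegionBlock : public BlockBase {
public:
  std::vector<std::unique_ptr<BlockBase>> Blocks;
  bool IsReplicator = false;
  void execute(TransformState &State) const override {
    const unsigned Copies = IsReplicator ? State.VF * State.UF : 1;
    for (unsigned Copy = 0; Copy < Copies; ++Copy)
      for (const auto &B : Blocks)
        B->execute(State);
  }
};
```

Because a region is itself a block, regions nest, which is how the same structure can describe both inner loops and replicated scalarized code.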
Related LLVM components
-----------------------
1. SLP Vectorizer: one can compare the VPlan model with LLVM's existing SLP
tree, where TSLP [3]_ adds Plan Step 2.b.
2. RegionInfo: one can compare VPlan's H-CFG with the Region Analysis as used by
Polly [7]_.
References
----------
.. [1] "Outer-loop vectorization: revisited for short SIMD architectures", Dorit
Nuzman and Ayal Zaks, PACT 2008.
.. [2] "Proposal for function vectorization and loop vectorization with function
calls", Xinmin Tian, [`cfe-dev
<http://lists.llvm.org/pipermail/cfe-dev/2016-March/047732.html>`_],
March 2, 2016.
See also `review <https://reviews.llvm.org/D22792>`_.
.. [3] "Throttling Automatic Vectorization: When Less is More", Vasileios
Porpodas and Tim Jones, PACT 2015 and LLVM Developers' Meeting 2015.
.. [4] "Exploiting mixed SIMD parallelism by reducing data reorganization
overhead", Hao Zhou and Jingling Xue, CGO 2016.
.. [5] "Register Allocation via Hierarchical Graph Coloring", David Callahan and
Brian Koblenz, PLDI 1991.
.. [6] "Structural analysis: A new approach to flow analysis in optimizing
compilers", M. Sharir, Journal of Computer Languages, Jan. 1980.
.. [7] "Enabling Polyhedral Optimizations in LLVM", Tobias Grosser, Diploma
thesis, 2011.
.. [8] "Introducing VPlan to the Loop Vectorizer", Gil Rapaport and Ayal Zaks,
European LLVM Developers' Meeting 2017.

@ -382,6 +382,17 @@ And Linpack-pc with the same configuration. Result is Mflops, higher is better.
.. image:: linpack-pc.png
Ongoing Development Directions
------------------------------
.. toctree::
:hidden:
Proposals/VectorizationPlan
:doc:`Proposals/VectorizationPlan`
Modeling the process and upgrading the infrastructure of LLVM's Loop Vectorizer.
.. _slp-vectorizer:
The SLP Vectorizer

@ -528,6 +528,7 @@ can be better.
CodeOfConduct
Proposals/GitHubMove
Proposals/VectorizationPlan
:doc:`CodeOfConduct`
Proposal to adopt a code of conduct on the LLVM social spaces (lists, events,
@ -536,6 +537,8 @@ can be better.
:doc:`Proposals/GitHubMove`
Proposal to move from SVN/Git to GitHub.
:doc:`Proposals/VectorizationPlan`
Proposal to model the process and upgrade the infrastructure of LLVM's Loop Vectorizer.
Indices and tables
==================

@ -1536,8 +1536,7 @@ public:
/// Determine if the SCEV can be evaluated at loop's entry. It is true if it
/// doesn't depend on a SCEVUnknown of an instruction which is dominated by
/// the header of loop L.
bool isAvailableAtLoopEntry(const SCEV *S, const Loop *L, DominatorTree &DT,
LoopInfo &LI);
bool isAvailableAtLoopEntry(const SCEV *S, const Loop *L);
/// Return true if the given SCEV changes value in a known way in the
/// specified loop. This property being true implies that the value is

@ -383,11 +383,11 @@ private:
return;
#define HANDLE_DIEVALUE_SMALL(T) \
case is##T: \
destruct<DIE##T>();
destruct<DIE##T>(); \
return;
#define HANDLE_DIEVALUE_LARGE(T) \
case is##T: \
destruct<const DIE##T *>();
destruct<const DIE##T *>(); \
return;
#include "llvm/CodeGen/DIEValue.def"
}
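The hunk above appears to add a line-continuation backslash after the ``destruct<...>();`` statements so that the following ``return;`` becomes part of each macro body. A minimal standalone sketch of that pattern is shown here; ``Kind``, ``HANDLE_KIND`` and ``classify`` are hypothetical names and are not taken from the LLVM sources.

```cpp
#include <cassert>

enum Kind { isNone, isSmall, isLarge };

// Every line of the macro body except the last ends in a backslash. Without
// the backslash after `++Calls;`, the `return` would sit outside the macro,
// and the expanded cases would fall through into one another.
#define HANDLE_KIND(T, Value) \
  case is##T:                 \
    ++Calls;                  \
    return Value;

// Returns a per-kind value and counts how many case bodies actually ran.
int classify(Kind K, int &Calls) {
  switch (K) {
  case isNone:
    return -1;
    HANDLE_KIND(Small, 1)
    HANDLE_KIND(Large, 2)
  }
  return -1;
}
#undef HANDLE_KIND
```

With the continuation in place, each expanded case runs exactly one body and then returns.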

@ -13,6 +13,8 @@
#include <cinttypes>
#include <type_traits>
#include "llvm/Support/Endian.h"
namespace llvm {
namespace codeview {
@ -291,7 +293,7 @@ enum class ModifierOptions : uint16_t {
};
CV_DEFINE_ENUM_CLASS_FLAGS_OPERATORS(ModifierOptions)
enum class ModuleDebugFragmentKind : uint32_t {
enum class DebugSubsectionKind : uint32_t {
None = 0,
Symbols = 0xf1,
Lines = 0xf2,
@ -550,6 +552,24 @@ enum LineFlags : uint16_t {
LF_None = 0,
LF_HaveColumns = 1, // CV_LINES_HAVE_COLUMNS
};
/// Data in the SUBSEC_FRAMEDATA subsection.
struct FrameData {
support::ulittle32_t RvaStart;
support::ulittle32_t CodeSize;
support::ulittle32_t LocalSize;
support::ulittle32_t ParamsSize;
support::ulittle32_t MaxStackSize;
support::ulittle32_t FrameFunc;
support::ulittle16_t PrologSize;
support::ulittle16_t SavedRegsSize;
support::ulittle32_t Flags;
enum : uint32_t {
HasSEH = 1 << 0,
HasEH = 1 << 1,
IsFunctionStart = 1 << 2,
};
};
}
}

@ -1,4 +1,4 @@
//===- ModuleDebugFileChecksumFragment.h ------------------------*- C++ -*-===//
//===- DebugChecksumsSubsection.h -------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,12 +7,12 @@
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFILECHECKSUMFRAGMENT_H
#define LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFILECHECKSUMFRAGMENT_H
#ifndef LLVM_DEBUGINFO_CODEVIEW_DEBUGCHECKSUMSSUBSECTION_H
#define LLVM_DEBUGINFO_CODEVIEW_DEBUGCHECKSUMSSUBSECTION_H
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragment.h"
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/Support/Allocator.h"
#include "llvm/Support/BinaryStreamArray.h"
#include "llvm/Support/BinaryStreamReader.h"
@ -21,7 +21,7 @@
namespace llvm {
namespace codeview {
class StringTable;
class DebugStringTableSubsection;
struct FileChecksumEntry {
uint32_t FileNameOffset; // Byte offset of filename in global stringtable.
@ -43,19 +43,22 @@ public:
namespace llvm {
namespace codeview {
class ModuleDebugFileChecksumFragmentRef final : public ModuleDebugFragmentRef {
class DebugChecksumsSubsectionRef final : public DebugSubsectionRef {
typedef VarStreamArray<codeview::FileChecksumEntry> FileChecksumArray;
typedef FileChecksumArray::Iterator Iterator;
public:
ModuleDebugFileChecksumFragmentRef()
: ModuleDebugFragmentRef(ModuleDebugFragmentKind::FileChecksums) {}
DebugChecksumsSubsectionRef()
: DebugSubsectionRef(DebugSubsectionKind::FileChecksums) {}
static bool classof(const ModuleDebugFragmentRef *S) {
return S->kind() == ModuleDebugFragmentKind::FileChecksums;
static bool classof(const DebugSubsectionRef *S) {
return S->kind() == DebugSubsectionKind::FileChecksums;
}
bool valid() const { return Checksums.valid(); }
Error initialize(BinaryStreamReader Reader);
Error initialize(BinaryStreamRef Stream);
Iterator begin() { return Checksums.begin(); }
Iterator end() { return Checksums.end(); }
@ -66,23 +69,23 @@ private:
FileChecksumArray Checksums;
};
class ModuleDebugFileChecksumFragment final : public ModuleDebugFragment {
class DebugChecksumsSubsection final : public DebugSubsection {
public:
explicit ModuleDebugFileChecksumFragment(StringTable &Strings);
explicit DebugChecksumsSubsection(DebugStringTableSubsection &Strings);
static bool classof(const ModuleDebugFragment *S) {
return S->kind() == ModuleDebugFragmentKind::FileChecksums;
static bool classof(const DebugSubsection *S) {
return S->kind() == DebugSubsectionKind::FileChecksums;
}
void addChecksum(StringRef FileName, FileChecksumKind Kind,
ArrayRef<uint8_t> Bytes);
uint32_t calculateSerializedLength() override;
Error commit(BinaryStreamWriter &Writer) override;
uint32_t calculateSerializedSize() const override;
Error commit(BinaryStreamWriter &Writer) const override;
uint32_t mapChecksumOffset(StringRef FileName) const;
private:
StringTable &Strings;
DebugStringTableSubsection &Strings;
DenseMap<uint32_t, uint32_t> OffsetMap;
uint32_t SerializedSize = 0;

@ -0,0 +1,59 @@
//===- DebugFrameDataSubsection.h ------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEBUGINFO_CODEVIEW_DEBUGFRAMEDATASUBSECTION_H
#define LLVM_DEBUGINFO_CODEVIEW_DEBUGFRAMEDATASUBSECTION_H
#include "llvm/DebugInfo/CodeView/CodeView.h"
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/Support/BinaryStreamReader.h"
#include "llvm/Support/Error.h"
namespace llvm {
namespace codeview {
class DebugFrameDataSubsectionRef final : public DebugSubsectionRef {
public:
DebugFrameDataSubsectionRef()
: DebugSubsectionRef(DebugSubsectionKind::FrameData) {}
static bool classof(const DebugSubsectionRef *S) {
return S->kind() == DebugSubsectionKind::FrameData;
}
Error initialize(BinaryStreamReader Reader);
FixedStreamArray<FrameData>::Iterator begin() const { return Frames.begin(); }
FixedStreamArray<FrameData>::Iterator end() const { return Frames.end(); }
const void *getRelocPtr() const { return RelocPtr; }
private:
const uint32_t *RelocPtr = nullptr;
FixedStreamArray<FrameData> Frames;
};
class DebugFrameDataSubsection final : public DebugSubsection {
public:
DebugFrameDataSubsection()
: DebugSubsection(DebugSubsectionKind::FrameData) {}
static bool classof(const DebugSubsection *S) {
return S->kind() == DebugSubsectionKind::FrameData;
}
uint32_t calculateSerializedSize() const override;
Error commit(BinaryStreamWriter &Writer) const override;
void addFrameData(const FrameData &Frame);
private:
std::vector<FrameData> Frames;
};
}
}
#endif

@ -1,4 +1,4 @@
//===- ModuleDebugInlineeLinesFragment.h ------------------------*- C++ -*-===//
//===- DebugInlineeLinesSubsection.h ----------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,11 +7,11 @@
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGINLINEELINESFRAGMENT_H
#define LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGINLINEELINESFRAGMENT_H
#ifndef LLVM_DEBUGINFO_CODEVIEW_DEBUGINLINEELINESSUBSECTION_H
#define LLVM_DEBUGINFO_CODEVIEW_DEBUGINLINEELINESSUBSECTION_H
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/DebugInfo/CodeView/Line.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragment.h"
#include "llvm/Support/BinaryStreamArray.h"
#include "llvm/Support/BinaryStreamReader.h"
#include "llvm/Support/Error.h"
@ -19,9 +19,8 @@
namespace llvm {
namespace codeview {
class ModuleDebugInlineeLineFragmentRef;
class ModuleDebugFileChecksumFragment;
class StringTable;
class DebugInlineeLinesSubsectionRef;
class DebugChecksumsSubsection;
enum class InlineeLinesSignature : uint32_t {
Normal, // CV_INLINEE_SOURCE_LINE_SIGNATURE
@ -51,15 +50,15 @@ template <> struct VarStreamArrayExtractor<codeview::InlineeSourceLine> {
};
namespace codeview {
class ModuleDebugInlineeLineFragmentRef final : public ModuleDebugFragmentRef {
class DebugInlineeLinesSubsectionRef final : public DebugSubsectionRef {
typedef VarStreamArray<InlineeSourceLine> LinesArray;
typedef LinesArray::Iterator Iterator;
public:
ModuleDebugInlineeLineFragmentRef();
DebugInlineeLinesSubsectionRef();
static bool classof(const ModuleDebugFragmentRef *S) {
return S->kind() == ModuleDebugFragmentKind::InlineeLines;
static bool classof(const DebugSubsectionRef *S) {
return S->kind() == DebugSubsectionKind::InlineeLines;
}
Error initialize(BinaryStreamReader Reader);
@ -73,23 +72,23 @@ private:
VarStreamArray<InlineeSourceLine> Lines;
};
class ModuleDebugInlineeLineFragment final : public ModuleDebugFragment {
class DebugInlineeLinesSubsection final : public DebugSubsection {
public:
ModuleDebugInlineeLineFragment(ModuleDebugFileChecksumFragment &Checksums,
bool HasExtraFiles);
DebugInlineeLinesSubsection(DebugChecksumsSubsection &Checksums,
bool HasExtraFiles);
static bool classof(const ModuleDebugFragment *S) {
return S->kind() == ModuleDebugFragmentKind::InlineeLines;
static bool classof(const DebugSubsection *S) {
return S->kind() == DebugSubsectionKind::InlineeLines;
}
Error commit(BinaryStreamWriter &Writer) override;
uint32_t calculateSerializedLength() override;
Error commit(BinaryStreamWriter &Writer) const override;
uint32_t calculateSerializedSize() const override;
void addInlineSite(TypeIndex FuncId, StringRef FileName, uint32_t SourceLine);
void addExtraFile(StringRef FileName);
private:
ModuleDebugFileChecksumFragment &Checksums;
DebugChecksumsSubsection &Checksums;
bool HasExtraFiles = false;
uint32_t ExtraFileCount = 0;

@ -1,4 +1,4 @@
//===- ModuleDebugLineFragment.h --------------------------------*- C++ -*-===//
//===- DebugLinesSubsection.h --------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -10,8 +10,8 @@
#ifndef LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGLINEFRAGMENT_H
#define LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGLINEFRAGMENT_H
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/DebugInfo/CodeView/Line.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragment.h"
#include "llvm/Support/BinaryStreamArray.h"
#include "llvm/Support/BinaryStreamReader.h"
#include "llvm/Support/Error.h"
@ -19,8 +19,8 @@
namespace llvm {
namespace codeview {
class ModuleDebugFileChecksumFragment;
class StringTable;
class DebugChecksumsSubsection;
class DebugStringTableSubsection;
// Corresponds to the `CV_DebugSLinesHeader_t` structure.
struct LineFragmentHeader {
@ -70,16 +70,16 @@ public:
LineColumnEntry &Item, const LineFragmentHeader *Ctx);
};
class ModuleDebugLineFragmentRef final : public ModuleDebugFragmentRef {
class DebugLinesSubsectionRef final : public DebugSubsectionRef {
friend class LineColumnExtractor;
typedef VarStreamArray<LineColumnEntry, LineColumnExtractor> LineInfoArray;
typedef LineInfoArray::Iterator Iterator;
public:
ModuleDebugLineFragmentRef();
DebugLinesSubsectionRef();
static bool classof(const ModuleDebugFragmentRef *S) {
return S->kind() == ModuleDebugFragmentKind::Lines;
static bool classof(const DebugSubsectionRef *S) {
return S->kind() == DebugSubsectionKind::Lines;
}
Error initialize(BinaryStreamReader Reader);
@ -96,7 +96,7 @@ private:
LineInfoArray LinesAndColumns;
};
class ModuleDebugLineFragment final : public ModuleDebugFragment {
class DebugLinesSubsection final : public DebugSubsection {
struct Block {
Block(uint32_t ChecksumBufferOffset)
: ChecksumBufferOffset(ChecksumBufferOffset) {}
@ -107,11 +107,11 @@ class ModuleDebugLineFragment final : public ModuleDebugFragment {
};
public:
ModuleDebugLineFragment(ModuleDebugFileChecksumFragment &Checksums,
StringTable &Strings);
DebugLinesSubsection(DebugChecksumsSubsection &Checksums,
DebugStringTableSubsection &Strings);
static bool classof(const ModuleDebugFragment *S) {
return S->kind() == ModuleDebugFragmentKind::Lines;
static bool classof(const DebugSubsection *S) {
return S->kind() == DebugSubsectionKind::Lines;
}
void createBlock(StringRef FileName);
@ -119,8 +119,8 @@ public:
void addLineAndColumnInfo(uint32_t Offset, const LineInfo &Line,
uint32_t ColStart, uint32_t ColEnd);
uint32_t calculateSerializedLength() override;
Error commit(BinaryStreamWriter &Writer) override;
uint32_t calculateSerializedSize() const override;
Error commit(BinaryStreamWriter &Writer) const override;
void setRelocationAddress(uint16_t Segment, uint16_t Offset);
void setCodeSize(uint32_t Size);
@ -129,7 +129,7 @@ public:
bool hasColumnInfo() const;
private:
ModuleDebugFileChecksumFragment &Checksums;
DebugChecksumsSubsection &Checksums;
uint16_t RelocOffset = 0;
uint16_t RelocSegment = 0;

@ -1,4 +1,4 @@
//===- StringTable.h - CodeView String Table Reader/Writer ------*- C++ -*-===//
//===- DebugStringTableSubsection.h - CodeView String Table -----*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,12 +7,12 @@
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEBUGINFO_CODEVIEW_STRINGTABLE_H
#define LLVM_DEBUGINFO_CODEVIEW_STRINGTABLE_H
#ifndef LLVM_DEBUGINFO_CODEVIEW_DEBUGSTRINGTABLESUBSECTION_H
#define LLVM_DEBUGINFO_CODEVIEW_DEBUGSTRINGTABLESUBSECTION_H
#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/Support/BinaryStreamRef.h"
#include "llvm/Support/Error.h"
@ -28,11 +28,15 @@ namespace codeview {
/// Represents a read-only view of a CodeView string table. This is a very
/// simple flat buffer consisting of null-terminated strings, where strings
/// are retrieved by their offset in the buffer. StringTableRef does not own
/// the underlying storage for the buffer.
class StringTableRef {
/// are retrieved by their offset in the buffer. DebugStringTableSubsectionRef
/// does not own the underlying storage for the buffer.
class DebugStringTableSubsectionRef : public DebugSubsectionRef {
public:
StringTableRef();
DebugStringTableSubsectionRef();
static bool classof(const DebugSubsectionRef *S) {
return S->kind() == DebugSubsectionKind::StringTable;
}
Error initialize(BinaryStreamRef Contents);
@ -44,11 +48,18 @@ private:
BinaryStreamRef Stream;
};
/// Represents a read-write view of a CodeView string table. StringTable owns
/// the underlying storage for the table, and is capable of serializing the
/// string table into a format understood by StringTableRef.
class StringTable {
/// Represents a read-write view of a CodeView string table.
/// DebugStringTableSubsection owns the underlying storage for the table, and is
/// capable of serializing the string table into a format understood by
/// DebugStringTableSubsectionRef.
class DebugStringTableSubsection : public DebugSubsection {
public:
DebugStringTableSubsection();
static bool classof(const DebugSubsection *S) {
return S->kind() == DebugSubsectionKind::StringTable;
}
// If string S does not exist in the string table, insert it.
// Returns the ID for S.
uint32_t insert(StringRef S);
@ -56,8 +67,8 @@ public:
// Return the ID for string S. Assumes S exists in the table.
uint32_t getStringId(StringRef S) const;
uint32_t calculateSerializedSize() const;
Error commit(BinaryStreamWriter &Writer) const;
uint32_t calculateSerializedSize() const override;
Error commit(BinaryStreamWriter &Writer) const override;
uint32_t size() const;
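The comments in this header describe the string table as a flat buffer of null-terminated strings addressed by offset, with ``insert`` returning an ID for the string. A minimal self-contained sketch of that idea follows. ``FlatStringTable`` is a hypothetical class, and two details are assumptions rather than facts from the diff: that the ID is the byte offset, and that offset 0 is reserved for the empty string.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical sketch of a writable string table backed by one flat buffer.
class FlatStringTable {
  // Assumption: the buffer starts with a single '\0', so offset 0 is "".
  std::string Buffer = std::string(1, '\0');
  std::unordered_map<std::string, uint32_t> Offsets;

public:
  FlatStringTable() { Offsets.emplace("", 0); }

  // If S is not in the table yet, append it; return the offset of S.
  uint32_t insert(const std::string &S) {
    auto It = Offsets.find(S);
    if (It != Offsets.end())
      return It->second;
    const uint32_t Offset = static_cast<uint32_t>(Buffer.size());
    Buffer += S;
    Buffer.push_back('\0');
    Offsets.emplace(S, Offset);
    return Offset;
  }

  // Read back the null-terminated string that starts at Offset.
  std::string getString(uint32_t Offset) const {
    assert(Offset < Buffer.size() && "offset out of range");
    return std::string(Buffer.c_str() + Offset);
  }

  // Total number of bytes the serialized table would occupy.
  uint32_t size() const { return static_cast<uint32_t>(Buffer.size()); }
};
```

Inserting the same string twice returns the same offset, which is what lets other subsections refer to file names by a stable 32-bit value.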

@ -0,0 +1,52 @@
//===- DebugSubsection.h ------------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFRAGMENT_H
#define LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFRAGMENT_H
#include "llvm/DebugInfo/CodeView/CodeView.h"
#include "llvm/Support/BinaryStreamWriter.h"
#include "llvm/Support/Casting.h"
namespace llvm {
namespace codeview {
class DebugSubsectionRef {
public:
explicit DebugSubsectionRef(DebugSubsectionKind Kind) : Kind(Kind) {}
virtual ~DebugSubsectionRef();
static bool classof(const DebugSubsectionRef *S) { return true; }
DebugSubsectionKind kind() const { return Kind; }
protected:
DebugSubsectionKind Kind;
};
class DebugSubsection {
public:
explicit DebugSubsection(DebugSubsectionKind Kind) : Kind(Kind) {}
virtual ~DebugSubsection();
static bool classof(const DebugSubsection *S) { return true; }
DebugSubsectionKind kind() const { return Kind; }
virtual Error commit(BinaryStreamWriter &Writer) const = 0;
virtual uint32_t calculateSerializedSize() const = 0;
protected:
DebugSubsectionKind Kind;
};
} // namespace codeview
} // namespace llvm
#endif // LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFRAGMENT_H
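The ``kind()`` and ``classof`` members added in this header follow the kind-tag style of run-time type identification. The standalone sketch below shows how such a ``classof`` is typically consumed. All names are hypothetical: ``isaKind`` and ``dynCast`` are simplified stand-ins written for this example and are not the templates from ``llvm/Support/Casting.h``.

```cpp
#include <cassert>
#include <cstdint>

// Kind values mirror the Symbols and Lines enumerators shown in CodeView.h.
enum class SubsectionKind : uint32_t { None = 0, Symbols = 0xf1, Lines = 0xf2 };

class SubsectionRef {
public:
  explicit SubsectionRef(SubsectionKind Kind) : Kind(Kind) {}
  virtual ~SubsectionRef() = default;
  // Any SubsectionRef is trivially an instance of the base class.
  static bool classof(const SubsectionRef *) { return true; }
  SubsectionKind kind() const { return Kind; }

protected:
  SubsectionKind Kind;
};

class LinesRef final : public SubsectionRef {
public:
  LinesRef() : SubsectionRef(SubsectionKind::Lines) {}
  // The parameter is a pointer to the base of this hierarchy, because that
  // is the type a caller holds when asking the question.
  static bool classof(const SubsectionRef *S) {
    return S->kind() == SubsectionKind::Lines;
  }
};

class SymbolsRef final : public SubsectionRef {
public:
  SymbolsRef() : SubsectionRef(SubsectionKind::Symbols) {}
  static bool classof(const SubsectionRef *S) {
    return S->kind() == SubsectionKind::Symbols;
  }
};

// Simplified stand-ins for isa<> and dyn_cast<>.
template <typename To, typename From> bool isaKind(const From *V) {
  return To::classof(V);
}
template <typename To, typename From> const To *dynCast(const From *V) {
  return To::classof(V) ? static_cast<const To *>(V) : nullptr;
}
```

A ``classof`` whose parameter type belongs to a different hierarchy would not accept the base pointer, so each derived class takes a pointer to its own base.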

@ -1,4 +1,4 @@
//===- ModuleDebugFragment.h ------------------------------------*- C++ -*-===//
//===- DebugSubsection.h ------------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -20,52 +20,49 @@
namespace llvm {
namespace codeview {
class ModuleDebugFragment;
class DebugSubsection;
// Corresponds to the `CV_DebugSSubsectionHeader_t` structure.
struct ModuleDebugFragmentHeader {
support::ulittle32_t Kind; // codeview::ModuleDebugFragmentKind enum
struct DebugSubsectionHeader {
support::ulittle32_t Kind; // codeview::DebugSubsectionKind enum
support::ulittle32_t Length; // number of bytes occupied by this record.
};
class ModuleDebugFragmentRecord {
class DebugSubsectionRecord {
public:
ModuleDebugFragmentRecord();
ModuleDebugFragmentRecord(ModuleDebugFragmentKind Kind, BinaryStreamRef Data);
DebugSubsectionRecord();
DebugSubsectionRecord(DebugSubsectionKind Kind, BinaryStreamRef Data);
static Error initialize(BinaryStreamRef Stream,
ModuleDebugFragmentRecord &Info);
static Error initialize(BinaryStreamRef Stream, DebugSubsectionRecord &Info);
uint32_t getRecordLength() const;
ModuleDebugFragmentKind kind() const;
DebugSubsectionKind kind() const;
BinaryStreamRef getRecordData() const;
private:
ModuleDebugFragmentKind Kind;
DebugSubsectionKind Kind;
BinaryStreamRef Data;
};
class ModuleDebugFragmentRecordBuilder {
class DebugSubsectionRecordBuilder {
public:
ModuleDebugFragmentRecordBuilder(ModuleDebugFragmentKind Kind,
ModuleDebugFragment &Frag);
DebugSubsectionRecordBuilder(DebugSubsectionKind Kind, DebugSubsection &Frag);
uint32_t calculateSerializedLength();
Error commit(BinaryStreamWriter &Writer);
private:
ModuleDebugFragmentKind Kind;
ModuleDebugFragment &Frag;
DebugSubsectionKind Kind;
DebugSubsection &Frag;
};
} // namespace codeview
template <>
struct VarStreamArrayExtractor<codeview::ModuleDebugFragmentRecord> {
template <> struct VarStreamArrayExtractor<codeview::DebugSubsectionRecord> {
typedef void ContextType;
static Error extract(BinaryStreamRef Stream, uint32_t &Length,
codeview::ModuleDebugFragmentRecord &Info) {
if (auto EC = codeview::ModuleDebugFragmentRecord::initialize(Stream, Info))
codeview::DebugSubsectionRecord &Info) {
if (auto EC = codeview::DebugSubsectionRecord::initialize(Stream, Info))
return EC;
Length = Info.getRecordLength();
return Error::success();
@ -73,7 +70,7 @@ struct VarStreamArrayExtractor<codeview::ModuleDebugFragmentRecord> {
};
namespace codeview {
typedef VarStreamArray<ModuleDebugFragmentRecord> ModuleDebugFragmentArray;
typedef VarStreamArray<DebugSubsectionRecord> DebugSubsectionArray;
}
} // namespace llvm

@ -1,4 +1,4 @@
//===- ModuleDebugFragmentVisitor.h -----------------------------*- C++ -*-===//
//===- DebugSubsectionVisitor.h -----------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -17,43 +17,41 @@ namespace llvm {
namespace codeview {
class ModuleDebugFileChecksumFragmentRef;
class ModuleDebugFragmentRecord;
class ModuleDebugInlineeLineFragmentRef;
class ModuleDebugLineFragmentRef;
class ModuleDebugUnknownFragmentRef;
class DebugChecksumsSubsectionRef;
class DebugSubsectionRecord;
class DebugInlineeLinesSubsectionRef;
class DebugLinesSubsectionRef;
class DebugUnknownSubsectionRef;
class ModuleDebugFragmentVisitor {
class DebugSubsectionVisitor {
public:
virtual ~ModuleDebugFragmentVisitor() = default;
virtual ~DebugSubsectionVisitor() = default;
virtual Error visitUnknown(ModuleDebugUnknownFragmentRef &Unknown) {
virtual Error visitUnknown(DebugUnknownSubsectionRef &Unknown) {
return Error::success();
}
virtual Error visitLines(ModuleDebugLineFragmentRef &Lines) {
virtual Error visitLines(DebugLinesSubsectionRef &Lines) {
return Error::success();
}
virtual Error
visitFileChecksums(ModuleDebugFileChecksumFragmentRef &Checksums) {
virtual Error visitFileChecksums(DebugChecksumsSubsectionRef &Checksums) {
return Error::success();
}
virtual Error visitInlineeLines(ModuleDebugInlineeLineFragmentRef &Inlinees) {
virtual Error visitInlineeLines(DebugInlineeLinesSubsectionRef &Inlinees) {
return Error::success();
}
virtual Error finished() { return Error::success(); }
};
Error visitModuleDebugFragment(const ModuleDebugFragmentRecord &R,
ModuleDebugFragmentVisitor &V);
Error visitDebugSubsection(const DebugSubsectionRecord &R,
DebugSubsectionVisitor &V);
template <typename T>
Error visitModuleDebugFragments(T &&FragmentRange,
ModuleDebugFragmentVisitor &V) {
Error visitDebugSubsections(T &&FragmentRange, DebugSubsectionVisitor &V) {
for (const auto &L : FragmentRange) {
if (auto EC = visitModuleDebugFragment(L, V))
if (auto EC = visitDebugSubsection(L, V))
return EC;
}
if (auto EC = V.finished())
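The visitor declared in this header dispatches each record to a per-kind callback and then calls ``finished()``. The sketch below shows the same shape in a self-contained form. It is an assumption-based simplification: all names are hypothetical, and callbacks return ``bool`` with ``true`` meaning success, whereas the real interface returns ``llvm::Error``, where a non-empty error means failure.

```cpp
#include <cassert>
#include <vector>

enum class Kind { Unknown, Lines, FileChecksums };

// Hypothetical record: only the kind matters for dispatch in this sketch.
struct Record {
  Kind K;
};

class Visitor {
public:
  virtual ~Visitor() = default;
  // Default callbacks ignore the record and report success.
  virtual bool visitUnknown(const Record &) { return true; }
  virtual bool visitLines(const Record &) { return true; }
  virtual bool visitFileChecksums(const Record &) { return true; }
  virtual bool finished() { return true; }
};

// Dispatch a single record to the callback matching its kind.
bool visitOne(const Record &R, Visitor &V) {
  switch (R.K) {
  case Kind::Lines:
    return V.visitLines(R);
  case Kind::FileChecksums:
    return V.visitFileChecksums(R);
  case Kind::Unknown:
    break;
  }
  return V.visitUnknown(R);
}

// Visit a whole range, stopping at the first failure.
template <typename Range> bool visitAll(Range &&Records, Visitor &V) {
  for (const auto &R : Records)
    if (!visitOne(R, V))
      return false;
  return V.finished();
}

// Example visitor that overrides only the callbacks it cares about.
struct LineCounter : Visitor {
  unsigned Lines = 0;
  bool Done = false;
  bool visitLines(const Record &) override {
    ++Lines;
    return true;
  }
  bool finished() override {
    Done = true;
    return true;
  }
};
```

Because every callback has a default, a client overrides only the kinds it needs and silently skips the rest.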

@ -0,0 +1,53 @@
//===- DebugSymbolsSubsection.h --------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEBUGINFO_CODEVIEW_DEBUGSYMBOLSSUBSECTION_H
#define LLVM_DEBUGINFO_CODEVIEW_DEBUGSYMBOLSSUBSECTION_H
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
#include "llvm/Support/Error.h"
namespace llvm {
namespace codeview {
class DebugSymbolsSubsectionRef final : public DebugSubsectionRef {
public:
DebugSymbolsSubsectionRef()
: DebugSubsectionRef(DebugSubsectionKind::Symbols) {}
static bool classof(const DebugSubsectionRef *S) {
return S->kind() == DebugSubsectionKind::Symbols;
}
Error initialize(BinaryStreamReader Reader);
private:
CVSymbolArray Records;
};
class DebugSymbolsSubsection final : public DebugSubsection {
public:
DebugSymbolsSubsection() : DebugSubsection(DebugSubsectionKind::Symbols) {}
static bool classof(const DebugSubsection *S) {
return S->kind() == DebugSubsectionKind::Symbols;
}
uint32_t calculateSerializedSize() const override;
Error commit(BinaryStreamWriter &Writer) const override;
void addSymbol(CVSymbol Symbol);
private:
uint32_t Length = 0;
std::vector<CVSymbol> Records;
};
}
}
#endif

@ -1,4 +1,4 @@
//===- ModuleDebugUnknownFragment.h -----------------------------*- C++ -*-===//
//===- DebugUnknownSubsection.h -----------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -10,17 +10,16 @@
#ifndef LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGUNKNOWNFRAGMENT_H
#define LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGUNKNOWNFRAGMENT_H
#include "llvm/DebugInfo/CodeView/ModuleDebugFragment.h"
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/Support/BinaryStreamRef.h"
namespace llvm {
namespace codeview {
class ModuleDebugUnknownFragmentRef final : public ModuleDebugFragmentRef {
class DebugUnknownSubsectionRef final : public DebugSubsectionRef {
public:
ModuleDebugUnknownFragmentRef(ModuleDebugFragmentKind Kind,
BinaryStreamRef Data)
: ModuleDebugFragmentRef(Kind), Data(Data) {}
DebugUnknownSubsectionRef(DebugSubsectionKind Kind, BinaryStreamRef Data)
: DebugSubsectionRef(Kind), Data(Data) {}
BinaryStreamRef getData() const { return Data; }

@ -1,48 +0,0 @@
//===- ModuleDebugFragment.h ------------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFRAGMENT_H
#define LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFRAGMENT_H
#include "llvm/DebugInfo/CodeView/CodeView.h"
#include "llvm/Support/BinaryStreamWriter.h"
#include "llvm/Support/Casting.h"
namespace llvm {
namespace codeview {
class ModuleDebugFragmentRef {
public:
explicit ModuleDebugFragmentRef(ModuleDebugFragmentKind Kind) : Kind(Kind) {}
virtual ~ModuleDebugFragmentRef();
ModuleDebugFragmentKind kind() const { return Kind; }
protected:
ModuleDebugFragmentKind Kind;
};
class ModuleDebugFragment {
public:
explicit ModuleDebugFragment(ModuleDebugFragmentKind Kind) : Kind(Kind) {}
virtual ~ModuleDebugFragment();
ModuleDebugFragmentKind kind() const { return Kind; }
virtual Error commit(BinaryStreamWriter &Writer) = 0;
virtual uint32_t calculateSerializedLength() = 0;
protected:
ModuleDebugFragmentKind Kind;
};
} // namespace codeview
} // namespace llvm
#endif // LLVM_DEBUGINFO_CODEVIEW_MODULEDEBUGFRAGMENT_H

@ -19,7 +19,7 @@ class BinaryStreamReader;
namespace codeview {
class StringTableRef;
class DebugStringTableSubsectionRef;
class SymbolVisitorDelegate {
public:
@ -27,7 +27,7 @@ public:
virtual uint32_t getRecordOffset(BinaryStreamReader Reader) = 0;
virtual StringRef getFileNameForFileOffset(uint32_t FileOffset) = 0;
virtual StringTableRef getStringTable() = 0;
virtual DebugStringTableSubsectionRef getStringTable() = 0;
};
} // end namespace codeview

@ -11,9 +11,9 @@
#define LLVM_DEBUGINFO_PDB_RAW_DBIMODULEDESCRIPTORBUILDER_H
#include "llvm/ADT/StringRef.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFileChecksumFragment.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugInlineeLinesFragment.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugLineFragment.h"
#include "llvm/DebugInfo/CodeView/DebugChecksumsSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugInlineeLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
#include "llvm/DebugInfo/PDB/Native/RawTypes.h"
#include "llvm/Support/Error.h"
@ -25,7 +25,7 @@ namespace llvm {
class BinaryStreamWriter;
namespace codeview {
class ModuleDebugFragmentRecordBuilder;
class DebugSubsectionRecordBuilder;
}
namespace msf {
@ -49,11 +49,11 @@ public:
void setObjFileName(StringRef Name);
void addSymbol(codeview::CVSymbol Symbol);
void addC13Fragment(std::unique_ptr<codeview::ModuleDebugLineFragment> Lines);
void addC13Fragment(std::unique_ptr<codeview::DebugLinesSubsection> Lines);
void addC13Fragment(
std::unique_ptr<codeview::ModuleDebugInlineeLineFragment> Inlinees);
std::unique_ptr<codeview::DebugInlineeLinesSubsection> Inlinees);
void setC13FileChecksums(
std::unique_ptr<codeview::ModuleDebugFileChecksumFragment> Checksums);
std::unique_ptr<codeview::DebugChecksumsSubsection> Checksums);
uint16_t getStreamIndex() const;
StringRef getModuleName() const { return ModuleName; }
@ -83,12 +83,11 @@ private:
std::vector<std::string> SourceFiles;
std::vector<codeview::CVSymbol> Symbols;
std::unique_ptr<codeview::ModuleDebugFileChecksumFragment> ChecksumInfo;
std::vector<std::unique_ptr<codeview::ModuleDebugLineFragment>> LineInfo;
std::vector<std::unique_ptr<codeview::ModuleDebugInlineeLineFragment>>
Inlinees;
std::unique_ptr<codeview::DebugChecksumsSubsection> ChecksumInfo;
std::vector<std::unique_ptr<codeview::DebugLinesSubsection>> LineInfo;
std::vector<std::unique_ptr<codeview::DebugInlineeLinesSubsection>> Inlinees;
std::vector<std::unique_ptr<codeview::ModuleDebugFragmentRecordBuilder>>
std::vector<std::unique_ptr<codeview::DebugSubsectionRecordBuilder>>
C13Builders;
ModuleInfoHeader Layout;

@ -10,7 +10,7 @@
#ifndef LLVM_DEBUGINFO_PDB_RAW_PDBDBISTREAM_H
#define LLVM_DEBUGINFO_PDB_RAW_PDBDBISTREAM_H
#include "llvm/DebugInfo/CodeView/ModuleDebugFragment.h"
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/DebugInfo/MSF/MappedBlockStream.h"
#include "llvm/DebugInfo/PDB/Native/DbiModuleDescriptor.h"
#include "llvm/DebugInfo/PDB/Native/DbiModuleList.h"
@ -19,8 +19,6 @@
#include "llvm/DebugInfo/PDB/Native/RawTypes.h"
#include "llvm/DebugInfo/PDB/PDBTypes.h"
#include "llvm/Support/BinaryStreamArray.h"
#include "llvm/Support/BinaryStreamArray.h"
#include "llvm/Support/BinaryStreamRef.h"
#include "llvm/Support/BinaryStreamRef.h"
#include "llvm/Support/Endian.h"
#include "llvm/Support/Error.h"

@ -12,7 +12,7 @@
#include "llvm/ADT/iterator_range.h"
#include "llvm/DebugInfo/CodeView/CVRecord.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragmentRecord.h"
#include "llvm/DebugInfo/CodeView/DebugSubsectionRecord.h"
#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
#include "llvm/DebugInfo/MSF/MappedBlockStream.h"
#include "llvm/Support/BinaryStreamArray.h"
@ -25,8 +25,7 @@ class PDBFile;
class DbiModuleDescriptor;
class ModuleDebugStreamRef {
typedef codeview::ModuleDebugFragmentArray::Iterator
LinesAndChecksumsIterator;
typedef codeview::DebugSubsectionArray::Iterator LinesAndChecksumsIterator;
public:
ModuleDebugStreamRef(const DbiModuleDescriptor &Module,
@ -58,7 +57,7 @@ private:
BinaryStreamRef C13LinesSubstream;
BinaryStreamRef GlobalRefsSubstream;
codeview::ModuleDebugFragmentArray LinesAndChecksums;
codeview::DebugSubsectionArray LinesAndChecksums;
};
}
}

@ -12,7 +12,7 @@
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/DebugInfo/CodeView/StringTable.h"
#include "llvm/DebugInfo/CodeView/DebugStringTableSubsection.h"
#include "llvm/Support/BinaryStreamArray.h"
#include "llvm/Support/BinaryStreamRef.h"
#include "llvm/Support/Endian.h"
@ -52,7 +52,7 @@ private:
Error readEpilogue(BinaryStreamReader &Reader);
const PDBStringTableHeader *Header = nullptr;
codeview::StringTableRef Strings;
codeview::DebugStringTableSubsectionRef Strings;
FixedStreamArray<support::ulittle32_t> IDs;
uint32_t ByteSize = 0;
uint32_t NameCount = 0;

@ -16,7 +16,7 @@
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/DebugInfo/CodeView/StringTable.h"
#include "llvm/DebugInfo/CodeView/DebugStringTableSubsection.h"
#include "llvm/Support/Error.h"
#include <vector>
@ -41,8 +41,10 @@ public:
uint32_t calculateSerializedSize() const;
Error commit(BinaryStreamWriter &Writer) const;
codeview::StringTable &getStrings() { return Strings; }
const codeview::StringTable &getStrings() const { return Strings; }
codeview::DebugStringTableSubsection &getStrings() { return Strings; }
const codeview::DebugStringTableSubsection &getStrings() const {
return Strings;
}
private:
uint32_t calculateHashTableSize() const;
@ -51,7 +53,7 @@ private:
Error writeHashTable(BinaryStreamWriter &Writer) const;
Error writeEpilogue(BinaryStreamWriter &Writer) const;
codeview::StringTable Strings;
codeview::DebugStringTableSubsection Strings;
};
} // end namespace pdb

@ -19,6 +19,7 @@
#include "llvm/ADT/SmallVector.h"
#include "llvm/Support/SMLoc.h"
#include <cstdint>
#include <map>
namespace llvm {
@ -44,7 +45,7 @@ struct ConstantPoolEntry {
class ConstantPool {
using EntryVecTy = SmallVector<ConstantPoolEntry, 4>;
EntryVecTy Entries;
DenseMap<int64_t, const MCSymbolRefExpr *> CachedEntries;
std::map<int64_t, const MCSymbolRefExpr *> CachedEntries;
public:
// Initialize a new empty constant pool

@ -14,25 +14,22 @@
#ifndef LLVM_SUPPORT_MANAGEDSTATIC_H
#define LLVM_SUPPORT_MANAGEDSTATIC_H
#include "llvm/Support/Compiler.h"
#include <atomic>
#include <cstddef>
namespace llvm {
/// object_creator - Helper method for ManagedStatic.
template<class C>
LLVM_LIBRARY_VISIBILITY void* object_creator() {
return new C();
}
template <class C> struct object_creator {
static void *call() { return new C(); }
};
/// object_deleter - Helper method for ManagedStatic.
///
template <typename T> struct LLVM_LIBRARY_VISIBILITY object_deleter {
template <typename T> struct object_deleter {
static void call(void *Ptr) { delete (T *)Ptr; }
};
template <typename T, size_t N>
struct LLVM_LIBRARY_VISIBILITY object_deleter<T[N]> {
template <typename T, size_t N> struct object_deleter<T[N]> {
static void call(void *Ptr) { delete[](T *)Ptr; }
};
@ -59,14 +56,15 @@ public:
/// libraries that link in LLVM components) and for making destruction be
/// explicit through the llvm_shutdown() function call.
///
template<class C>
template <class C, class Creator = object_creator<C>,
class Deleter = object_deleter<C>>
class ManagedStatic : public ManagedStaticBase {
public:
// Accessors.
C &operator*() {
void *Tmp = Ptr.load(std::memory_order_acquire);
if (!Tmp)
RegisterManagedStatic(object_creator<C>, object_deleter<C>::call);
RegisterManagedStatic(Creator::call, Deleter::call);
return *static_cast<C *>(Ptr.load(std::memory_order_relaxed));
}
@ -76,7 +74,7 @@ public:
const C &operator*() const {
void *Tmp = Ptr.load(std::memory_order_acquire);
if (!Tmp)
RegisterManagedStatic(object_creator<C>, object_deleter<C>::call);
RegisterManagedStatic(Creator::call, Deleter::call);
return *static_cast<C *>(Ptr.load(std::memory_order_relaxed));
}
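
The ManagedStatic hunk above replaces the hard-coded `object_creator<C>` function with `Creator` and `Deleter` template parameters, each exposing a static `call`. The following is a minimal, self-contained sketch of that policy pattern. It does not use the LLVM headers: `LazyStatic`, `DefaultCreator`, `DefaultDeleter`, and `CountingIntCreator` are hypothetical names, and the mutex-based registration is an assumption standing in for `RegisterManagedStatic`.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for illustration; these are NOT the LLVM classes.
template <class C> struct DefaultCreator {
  static void *call() { return new C(); }
};

template <class C> struct DefaultDeleter {
  static void call(void *Ptr) { delete static_cast<C *>(Ptr); }
};

// Lazily constructs a C on first dereference. Creation and destruction are
// delegated to the Creator and Deleter policy types.
template <class C, class Creator = DefaultCreator<C>,
          class Deleter = DefaultDeleter<C>>
class LazyStatic {
  std::atomic<void *> Ptr{nullptr};
  std::mutex Lock; // assumption: a plain mutex guards first construction

public:
  ~LazyStatic() { destroy(); }

  C &operator*() {
    void *Tmp = Ptr.load(std::memory_order_acquire);
    if (!Tmp) {
      std::lock_guard<std::mutex> Guard(Lock);
      // Re-check under the lock so only one thread runs the creator.
      Tmp = Ptr.load(std::memory_order_relaxed);
      if (!Tmp) {
        Tmp = Creator::call();
        Ptr.store(Tmp, std::memory_order_release);
      }
    }
    return *static_cast<C *>(Tmp);
  }

  bool isConstructed() const {
    return Ptr.load(std::memory_order_acquire) != nullptr;
  }

  // Explicit teardown, loosely analogous to what llvm_shutdown() triggers.
  void destroy() {
    if (void *Tmp = Ptr.exchange(nullptr, std::memory_order_acq_rel))
      Deleter::call(Tmp);
  }
};

// A custom creator that counts its invocations and picks a non-default value.
struct CountingIntCreator {
  static int &calls() {
    static int N = 0;
    return N;
  }
  static void *call() {
    ++calls();
    return new int(42);
  }
};
```

Declaring `LazyStatic<int, CountingIntCreator> S;` and dereferencing `*S` twice would run the custom creator once. The point of the extra template parameters is that callers can supply construction logic without changing the accessor code.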

@ -1189,6 +1189,9 @@ public:
return Init ? Init->getValue() : StringRef();
}
ArrayRef<Init *> getArgs() const {
return makeArrayRef(getTrailingObjects<Init *>(), NumArgs);
}
ArrayRef<StringInit *> getArgNames() const {
return makeArrayRef(getTrailingObjects<StringInit *>(), NumArgNames);
}
@ -1200,19 +1203,16 @@ public:
typedef SmallVectorImpl<Init*>::const_iterator const_arg_iterator;
typedef SmallVectorImpl<StringInit*>::const_iterator const_name_iterator;
inline const_arg_iterator arg_begin() const { return getTrailingObjects<Init *>(); }
inline const_arg_iterator arg_end () const { return arg_begin() + NumArgs; }
inline iterator_range<const_arg_iterator> args() const {
return llvm::make_range(arg_begin(), arg_end());
}
inline const_arg_iterator arg_begin() const { return getArgs().begin(); }
inline const_arg_iterator arg_end () const { return getArgs().end(); }
inline size_t arg_size () const { return NumArgs; }
inline size_t arg_size () const { return NumArgs; }
inline bool arg_empty() const { return NumArgs == 0; }
inline const_name_iterator name_begin() const { return getTrailingObjects<StringInit *>(); }
inline const_name_iterator name_end () const { return name_begin() + NumArgNames; }
inline const_name_iterator name_begin() const { return getArgNames().begin();}
inline const_name_iterator name_end () const { return getArgNames().end(); }
inline size_t name_size () const { return NumArgNames; }
inline size_t name_size () const { return NumArgNames; }
inline bool name_empty() const { return NumArgNames == 0; }
Init *getBit(unsigned Bit) const override {

@ -58,10 +58,11 @@ class Expression {
private:
ExpressionType EType;
unsigned Opcode;
mutable hash_code HashVal;
public:
Expression(ExpressionType ET = ET_Base, unsigned O = ~2U)
: EType(ET), Opcode(O) {}
: EType(ET), Opcode(O), HashVal(0) {}
Expression(const Expression &) = delete;
Expression &operator=(const Expression &) = delete;
virtual ~Expression();
@ -82,6 +83,14 @@ public:
return equals(Other);
}
hash_code getComputedHash() const {
// It's theoretically possible for a thing to hash to zero. In that case,
// we will just compute the hash a few extra times, which is no worse than
// we did before, which was to compute it always.
if (static_cast<unsigned>(HashVal) == 0)
HashVal = getHashValue();
return HashVal;
}
virtual bool equals(const Expression &Other) const { return true; }
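
The `getComputedHash()` addition above caches the hash in a `mutable` member and treats zero as "not computed yet". The sketch below illustrates that idea in isolation. `CachedHashString`, its toy byte-sum hash, and the `computeCount()` counter are hypothetical and exist only to make the behavior observable; they are not part of the LLVM `Expression` class.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical example class; not the LLVM Expression hierarchy.
class CachedHashString {
  std::string Text;
  mutable std::uint64_t HashVal = 0; // 0 doubles as "not cached yet"
  mutable unsigned ComputeCount = 0; // instrumentation for the example only

  // Toy hash (sum of byte values), chosen so the empty string hashes to 0.
  std::uint64_t computeHash() const {
    ++ComputeCount;
    std::uint64_t Sum = 0;
    for (unsigned char Ch : Text)
      Sum += Ch;
    return Sum;
  }

public:
  explicit CachedHashString(std::string T) : Text(std::move(T)) {}

  // Returns the cached hash, computing it only while the cache holds 0.
  // A value that genuinely hashes to 0 is recomputed on every call, which
  // stays correct and merely loses the caching benefit.
  std::uint64_t getComputedHash() const {
    if (HashVal == 0)
      HashVal = computeHash();
    return HashVal;
  }

  unsigned computeCount() const { return ComputeCount; }
};
```

With this sketch, hashing `"abc"` twice computes the hash once, whereas hashing the empty string twice computes it twice, because its hash is 0 and is therefore indistinguishable from an empty cache. That is the trade-off the comment in the hunk describes.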

@ -2178,8 +2178,7 @@ StrengthenNoWrapFlags(ScalarEvolution *SE, SCEVTypes Type,
return Flags;
}
bool ScalarEvolution::isAvailableAtLoopEntry(const SCEV *S, const Loop *L,
DominatorTree &DT, LoopInfo &LI) {
bool ScalarEvolution::isAvailableAtLoopEntry(const SCEV *S, const Loop *L) {
if (!isLoopInvariant(S, L))
return false;
// If a value depends on a SCEVUnknown which is defined after the loop, we
@ -2516,7 +2515,7 @@ const SCEV *ScalarEvolution::getAddExpr(SmallVectorImpl<const SCEV *> &Ops,
const SCEVAddRecExpr *AddRec = cast<SCEVAddRecExpr>(Ops[Idx]);
const Loop *AddRecLoop = AddRec->getLoop();
for (unsigned i = 0, e = Ops.size(); i != e; ++i)
if (isAvailableAtLoopEntry(Ops[i], AddRecLoop, DT, LI)) {
if (isAvailableAtLoopEntry(Ops[i], AddRecLoop)) {
LIOps.push_back(Ops[i]);
Ops.erase(Ops.begin()+i);
--i; --e;
@ -2791,7 +2790,7 @@ const SCEV *ScalarEvolution::getMulExpr(SmallVectorImpl<const SCEV *> &Ops,
const SCEVAddRecExpr *AddRec = cast<SCEVAddRecExpr>(Ops[Idx]);
const Loop *AddRecLoop = AddRec->getLoop();
for (unsigned i = 0, e = Ops.size(); i != e; ++i)
if (isAvailableAtLoopEntry(Ops[i], AddRecLoop, DT, LI)) {
if (isAvailableAtLoopEntry(Ops[i], AddRecLoop)) {
LIOps.push_back(Ops[i]);
Ops.erase(Ops.begin()+i);
--i; --e;

@ -15,8 +15,8 @@
#include "llvm/ADT/TinyPtrVector.h"
#include "llvm/DebugInfo/CodeView/CVTypeVisitor.h"
#include "llvm/DebugInfo/CodeView/CodeView.h"
#include "llvm/DebugInfo/CodeView/DebugInlineeLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/Line.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugInlineeLinesFragment.h"
#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
#include "llvm/DebugInfo/CodeView/TypeDatabase.h"
#include "llvm/DebugInfo/CodeView/TypeDumpVisitor.h"
@ -393,7 +393,7 @@ void CodeViewDebug::endModule() {
// subprograms.
switchToDebugSectionForSymbol(nullptr);
MCSymbol *CompilerInfo = beginCVSubsection(ModuleDebugFragmentKind::Symbols);
MCSymbol *CompilerInfo = beginCVSubsection(DebugSubsectionKind::Symbols);
emitCompilerInformation();
endCVSubsection(CompilerInfo);
@ -417,7 +417,7 @@ void CodeViewDebug::endModule() {
// Emit UDT records for any types used by global variables.
if (!GlobalUDTs.empty()) {
MCSymbol *SymbolsEnd = beginCVSubsection(ModuleDebugFragmentKind::Symbols);
MCSymbol *SymbolsEnd = beginCVSubsection(DebugSubsectionKind::Symbols);
emitDebugInfoForUDTs(GlobalUDTs);
endCVSubsection(SymbolsEnd);
}
@ -630,8 +630,7 @@ void CodeViewDebug::emitInlineeLinesSubsection() {
return;
OS.AddComment("Inlinee lines subsection");
MCSymbol *InlineEnd =
beginCVSubsection(ModuleDebugFragmentKind::InlineeLines);
MCSymbol *InlineEnd = beginCVSubsection(DebugSubsectionKind::InlineeLines);
// We don't provide any extra file info.
// FIXME: Find out if debuggers use this info.
@ -756,7 +755,7 @@ void CodeViewDebug::emitDebugInfoForFunction(const Function *GV,
// Emit a symbol subsection, required by VS2012+ to find function boundaries.
OS.AddComment("Symbol subsection for " + Twine(FuncName));
MCSymbol *SymbolsEnd = beginCVSubsection(ModuleDebugFragmentKind::Symbols);
MCSymbol *SymbolsEnd = beginCVSubsection(DebugSubsectionKind::Symbols);
{
MCSymbol *ProcRecordBegin = MMI->getContext().createTempSymbol(),
*ProcRecordEnd = MMI->getContext().createTempSymbol();
@ -2111,7 +2110,7 @@ void CodeViewDebug::beginInstruction(const MachineInstr *MI) {
maybeRecordLocation(DL, Asm->MF);
}
MCSymbol *CodeViewDebug::beginCVSubsection(ModuleDebugFragmentKind Kind) {
MCSymbol *CodeViewDebug::beginCVSubsection(DebugSubsectionKind Kind) {
MCSymbol *BeginLabel = MMI->getContext().createTempSymbol(),
*EndLabel = MMI->getContext().createTempSymbol();
OS.EmitIntValue(unsigned(Kind), 4);
@ -2171,7 +2170,7 @@ void CodeViewDebug::emitDebugInfoForGlobals() {
if (!GV->hasComdat() && !GV->isDeclarationForLinker()) {
if (!EndLabel) {
OS.AddComment("Symbol subsection for globals");
EndLabel = beginCVSubsection(ModuleDebugFragmentKind::Symbols);
EndLabel = beginCVSubsection(DebugSubsectionKind::Symbols);
}
// FIXME: emitDebugInfoForGlobal() doesn't handle DIExpressions.
emitDebugInfoForGlobal(GVE->getVariable(), GV, Asm->getSymbol(GV));
@ -2189,7 +2188,7 @@ void CodeViewDebug::emitDebugInfoForGlobals() {
OS.AddComment("Symbol subsection for " +
Twine(GlobalValue::dropLLVMManglingEscape(GV->getName())));
switchToDebugSectionForSymbol(GVSym);
EndLabel = beginCVSubsection(ModuleDebugFragmentKind::Symbols);
EndLabel = beginCVSubsection(DebugSubsectionKind::Symbols);
// FIXME: emitDebugInfoForGlobal() doesn't handle DIExpressions.
emitDebugInfoForGlobal(GVE->getVariable(), GV, GVSym);
endCVSubsection(EndLabel);

@ -216,7 +216,7 @@ class LLVM_LIBRARY_VISIBILITY CodeViewDebug : public DebugHandlerBase {
/// Opens a subsection of the given kind in a .debug$S codeview section.
/// Returns an end label for use with endCVSubsection when the subsection is
/// finished.
MCSymbol *beginCVSubsection(codeview::ModuleDebugFragmentKind Kind);
MCSymbol *beginCVSubsection(codeview::DebugSubsectionKind Kind);
void endCVSubsection(MCSymbol *EndLabel);

@ -23,7 +23,7 @@ using namespace llvm;
char Localizer::ID = 0;
INITIALIZE_PASS(Localizer, DEBUG_TYPE,
"Move/duplicate certain instructions close to their use", false,
false);
false)
Localizer::Localizer() : MachineFunctionPass(ID) {
initializeLocalizerPass(*PassRegistry::getPassRegistry());

@ -14567,7 +14567,8 @@ static SDValue narrowExtractedVectorLoad(SDNode *Extract, SelectionDAG &DAG) {
// extract instead or remove that condition entirely.
auto *Ld = dyn_cast<LoadSDNode>(Extract->getOperand(0));
auto *ExtIdx = dyn_cast<ConstantSDNode>(Extract->getOperand(1));
if (!Ld || !Ld->hasOneUse() || Ld->isVolatile() || !ExtIdx)
if (!Ld || !Ld->hasOneUse() || Ld->getExtensionType() || Ld->isVolatile() ||
!ExtIdx)
return SDValue();
// The narrow load will be offset from the base address of the old load if

@ -925,10 +925,6 @@ getStrictFPOpcodeAction(const TargetLowering &TLI, unsigned Opcode, EVT VT) {
if (Action != TargetLowering::Legal)
Action = TargetLowering::Expand;
// ISD::FPOWI returns 'Legal' even though it should be expanded.
if (Opcode == ISD::STRICT_FPOWI && Action == TargetLowering::Legal)
Action = TargetLowering::Expand;
return Action;
}
@ -1027,7 +1023,6 @@ void SelectionDAGLegalize::LegalizeOp(SDNode *Node) {
break;
case ISD::EXTRACT_ELEMENT:
case ISD::FLT_ROUNDS_:
case ISD::FPOWI:
case ISD::MERGE_VALUES:
case ISD::EH_RETURN:
case ISD::FRAME_TO_ARGS_OFFSET:

@ -935,6 +935,7 @@ void TargetLoweringBase::initActions() {
// These library functions default to expand.
setOperationAction(ISD::FROUND, VT, Expand);
setOperationAction(ISD::FPOWI, VT, Expand);
// These operations default to expand for vector types.
if (VT.isVector()) {

@ -7,14 +7,16 @@ add_llvm_library(LLVMDebugInfoCodeView
Formatters.cpp
LazyRandomTypeCollection.cpp
Line.cpp
ModuleDebugFileChecksumFragment.cpp
ModuleDebugFragment.cpp
ModuleDebugFragmentRecord.cpp
ModuleDebugFragmentVisitor.cpp
ModuleDebugInlineeLinesFragment.cpp
ModuleDebugLineFragment.cpp
DebugChecksumsSubsection.cpp
DebugFrameDataSubsection.cpp
DebugInlineeLinesSubsection.cpp
DebugLinesSubsection.cpp
DebugStringTableSubsection.cpp
DebugSubsection.cpp
DebugSubsectionRecord.cpp
DebugSubsectionVisitor.cpp
DebugSymbolsSubsection.cpp
RecordSerialization.cpp
StringTable.cpp
SymbolRecordMapping.cpp
SymbolDumper.cpp
SymbolSerializer.cpp

@ -1,4 +1,4 @@
//===- ModuleDebugFileChecksumFragment.cpp ----------------------*- C++ -*-===//
//===- DebugChecksumsSubsection.cpp ----------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,10 +7,10 @@
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/ModuleDebugFileChecksumFragment.h"
#include "llvm/DebugInfo/CodeView/DebugChecksumsSubsection.h"
#include "llvm/DebugInfo/CodeView/CodeViewError.h"
#include "llvm/DebugInfo/CodeView/StringTable.h"
#include "llvm/DebugInfo/CodeView/DebugStringTableSubsection.h"
#include "llvm/Support/BinaryStreamReader.h"
using namespace llvm;
@ -42,22 +42,24 @@ Error llvm::VarStreamArrayExtractor<FileChecksumEntry>::extract(
return Error::success();
}
Error ModuleDebugFileChecksumFragmentRef::initialize(
BinaryStreamReader Reader) {
Error DebugChecksumsSubsectionRef::initialize(BinaryStreamReader Reader) {
if (auto EC = Reader.readArray(Checksums, Reader.bytesRemaining()))
return EC;
return Error::success();
}
Error DebugChecksumsSubsectionRef::initialize(BinaryStreamRef Section) {
BinaryStreamReader Reader(Section);
return initialize(Reader);
}
ModuleDebugFileChecksumFragment::ModuleDebugFileChecksumFragment(
StringTable &Strings)
: ModuleDebugFragment(ModuleDebugFragmentKind::FileChecksums),
Strings(Strings) {}
DebugChecksumsSubsection::DebugChecksumsSubsection(
DebugStringTableSubsection &Strings)
: DebugSubsection(DebugSubsectionKind::FileChecksums), Strings(Strings) {}
void ModuleDebugFileChecksumFragment::addChecksum(StringRef FileName,
FileChecksumKind Kind,
ArrayRef<uint8_t> Bytes) {
void DebugChecksumsSubsection::addChecksum(StringRef FileName,
FileChecksumKind Kind,
ArrayRef<uint8_t> Bytes) {
FileChecksumEntry Entry;
if (!Bytes.empty()) {
uint8_t *Copy = Storage.Allocate<uint8_t>(Bytes.size());
@ -78,11 +80,11 @@ void ModuleDebugFileChecksumFragment::addChecksum(StringRef FileName,
SerializedSize += Len;
}
uint32_t ModuleDebugFileChecksumFragment::calculateSerializedLength() {
uint32_t DebugChecksumsSubsection::calculateSerializedSize() const {
return SerializedSize;
}
Error ModuleDebugFileChecksumFragment::commit(BinaryStreamWriter &Writer) {
Error DebugChecksumsSubsection::commit(BinaryStreamWriter &Writer) const {
for (const auto &FC : Checksums) {
FileChecksumEntryHeader Header;
Header.ChecksumKind = uint8_t(FC.Kind);
@ -98,8 +100,7 @@ Error ModuleDebugFileChecksumFragment::commit(BinaryStreamWriter &Writer) {
return Error::success();
}
uint32_t
ModuleDebugFileChecksumFragment::mapChecksumOffset(StringRef FileName) const {
uint32_t DebugChecksumsSubsection::mapChecksumOffset(StringRef FileName) const {
uint32_t Offset = Strings.getStringId(FileName);
auto Iter = OffsetMap.find(Offset);
assert(Iter != OffsetMap.end());

@ -0,0 +1,44 @@
//===- DebugFrameDataSubsection.cpp -----------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/DebugFrameDataSubsection.h"
#include "llvm/DebugInfo/CodeView/CodeViewError.h"
using namespace llvm;
using namespace llvm::codeview;
Error DebugFrameDataSubsectionRef::initialize(BinaryStreamReader Reader) {
if (auto EC = Reader.readObject(RelocPtr))
return EC;
if (Reader.bytesRemaining() % sizeof(FrameData) != 0)
return make_error<CodeViewError>(cv_error_code::corrupt_record,
"Invalid frame data record format!");
uint32_t Count = Reader.bytesRemaining() / sizeof(FrameData);
if (auto EC = Reader.readArray(Frames, Count))
return EC;
return Error::success();
}
uint32_t DebugFrameDataSubsection::calculateSerializedSize() const {
return 4 + sizeof(FrameData) * Frames.size();
}
Error DebugFrameDataSubsection::commit(BinaryStreamWriter &Writer) const {
if (auto EC = Writer.writeInteger<uint32_t>(0))
return EC;
if (auto EC = Writer.writeArray(makeArrayRef(Frames)))
return EC;
return Error::success();
}
void DebugFrameDataSubsection::addFrameData(const FrameData &Frame) {
Frames.push_back(Frame);
}

@ -1,4 +1,4 @@
//===- ModuleDebugInlineeLineFragment.cpp ------------------------*- C++-*-===//
//===- DebugInlineeLinesSubsection.cpp ------------------------*- C++-*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,12 +7,12 @@
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/ModuleDebugInlineeLinesFragment.h"
#include "llvm/DebugInfo/CodeView/DebugInlineeLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/CodeViewError.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFileChecksumFragment.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragmentRecord.h"
#include "llvm/DebugInfo/CodeView/StringTable.h"
#include "llvm/DebugInfo/CodeView/DebugChecksumsSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugStringTableSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugSubsectionRecord.h"
using namespace llvm;
using namespace llvm::codeview;
@ -37,10 +37,10 @@ Error VarStreamArrayExtractor<InlineeSourceLine>::extract(
return Error::success();
}
ModuleDebugInlineeLineFragmentRef::ModuleDebugInlineeLineFragmentRef()
: ModuleDebugFragmentRef(ModuleDebugFragmentKind::InlineeLines) {}
DebugInlineeLinesSubsectionRef::DebugInlineeLinesSubsectionRef()
: DebugSubsectionRef(DebugSubsectionKind::InlineeLines) {}
Error ModuleDebugInlineeLineFragmentRef::initialize(BinaryStreamReader Reader) {
Error DebugInlineeLinesSubsectionRef::initialize(BinaryStreamReader Reader) {
if (auto EC = Reader.readEnum(Signature))
return EC;
@ -52,16 +52,16 @@ Error ModuleDebugInlineeLineFragmentRef::initialize(BinaryStreamReader Reader) {
return Error::success();
}
bool ModuleDebugInlineeLineFragmentRef::hasExtraFiles() const {
bool DebugInlineeLinesSubsectionRef::hasExtraFiles() const {
return Signature == InlineeLinesSignature::ExtraFiles;
}
ModuleDebugInlineeLineFragment::ModuleDebugInlineeLineFragment(
ModuleDebugFileChecksumFragment &Checksums, bool HasExtraFiles)
: ModuleDebugFragment(ModuleDebugFragmentKind::InlineeLines),
Checksums(Checksums), HasExtraFiles(HasExtraFiles) {}
DebugInlineeLinesSubsection::DebugInlineeLinesSubsection(
DebugChecksumsSubsection &Checksums, bool HasExtraFiles)
: DebugSubsection(DebugSubsectionKind::InlineeLines), Checksums(Checksums),
HasExtraFiles(HasExtraFiles) {}
uint32_t ModuleDebugInlineeLineFragment::calculateSerializedLength() {
uint32_t DebugInlineeLinesSubsection::calculateSerializedSize() const {
// 4 bytes for the signature
uint32_t Size = sizeof(InlineeLinesSignature);
@ -78,7 +78,7 @@ uint32_t ModuleDebugInlineeLineFragment::calculateSerializedLength() {
return Size;
}
Error ModuleDebugInlineeLineFragment::commit(BinaryStreamWriter &Writer) {
Error DebugInlineeLinesSubsection::commit(BinaryStreamWriter &Writer) const {
InlineeLinesSignature Sig = InlineeLinesSignature::Normal;
if (HasExtraFiles)
Sig = InlineeLinesSignature::ExtraFiles;
@ -102,7 +102,7 @@ Error ModuleDebugInlineeLineFragment::commit(BinaryStreamWriter &Writer) {
return Error::success();
}
void ModuleDebugInlineeLineFragment::addExtraFile(StringRef FileName) {
void DebugInlineeLinesSubsection::addExtraFile(StringRef FileName) {
uint32_t Offset = Checksums.mapChecksumOffset(FileName);
auto &Entry = Entries.back();
@ -110,9 +110,9 @@ void ModuleDebugInlineeLineFragment::addExtraFile(StringRef FileName) {
++ExtraFileCount;
}
void ModuleDebugInlineeLineFragment::addInlineSite(TypeIndex FuncId,
StringRef FileName,
uint32_t SourceLine) {
void DebugInlineeLinesSubsection::addInlineSite(TypeIndex FuncId,
StringRef FileName,
uint32_t SourceLine) {
uint32_t Offset = Checksums.mapChecksumOffset(FileName);
Entries.emplace_back();

@ -1,4 +1,4 @@
//===- ModuleDebugLineFragment.cpp -------------------------------*- C++-*-===//
//===- DebugLinesSubsection.cpp -------------------------------*- C++-*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,12 +7,12 @@
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/ModuleDebugLineFragment.h"
#include "llvm/DebugInfo/CodeView/DebugLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/CodeViewError.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFileChecksumFragment.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragmentRecord.h"
#include "llvm/DebugInfo/CodeView/StringTable.h"
#include "llvm/DebugInfo/CodeView/DebugChecksumsSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugStringTableSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugSubsectionRecord.h"
using namespace llvm;
using namespace llvm::codeview;
@ -49,10 +49,10 @@ Error LineColumnExtractor::extract(BinaryStreamRef Stream, uint32_t &Len,
return Error::success();
}
ModuleDebugLineFragmentRef::ModuleDebugLineFragmentRef()
: ModuleDebugFragmentRef(ModuleDebugFragmentKind::Lines) {}
DebugLinesSubsectionRef::DebugLinesSubsectionRef()
: DebugSubsectionRef(DebugSubsectionKind::Lines) {}
Error ModuleDebugLineFragmentRef::initialize(BinaryStreamReader Reader) {
Error DebugLinesSubsectionRef::initialize(BinaryStreamReader Reader) {
if (auto EC = Reader.readObject(Header))
return EC;
@ -63,23 +63,21 @@ Error ModuleDebugLineFragmentRef::initialize(BinaryStreamReader Reader) {
return Error::success();
}
bool ModuleDebugLineFragmentRef::hasColumnInfo() const {
bool DebugLinesSubsectionRef::hasColumnInfo() const {
return !!(Header->Flags & LF_HaveColumns);
}
ModuleDebugLineFragment::ModuleDebugLineFragment(
ModuleDebugFileChecksumFragment &Checksums, StringTable &Strings)
: ModuleDebugFragment(ModuleDebugFragmentKind::Lines),
Checksums(Checksums) {}
DebugLinesSubsection::DebugLinesSubsection(DebugChecksumsSubsection &Checksums,
DebugStringTableSubsection &Strings)
: DebugSubsection(DebugSubsectionKind::Lines), Checksums(Checksums) {}
void ModuleDebugLineFragment::createBlock(StringRef FileName) {
void DebugLinesSubsection::createBlock(StringRef FileName) {
uint32_t Offset = Checksums.mapChecksumOffset(FileName);
Blocks.emplace_back(Offset);
}
void ModuleDebugLineFragment::addLineInfo(uint32_t Offset,
const LineInfo &Line) {
void DebugLinesSubsection::addLineInfo(uint32_t Offset, const LineInfo &Line) {
Block &B = Blocks.back();
LineNumberEntry LNE;
LNE.Flags = Line.getRawData();
@ -87,10 +85,10 @@ void ModuleDebugLineFragment::addLineInfo(uint32_t Offset,
B.Lines.push_back(LNE);
}
void ModuleDebugLineFragment::addLineAndColumnInfo(uint32_t Offset,
const LineInfo &Line,
uint32_t ColStart,
uint32_t ColEnd) {
void DebugLinesSubsection::addLineAndColumnInfo(uint32_t Offset,
const LineInfo &Line,
uint32_t ColStart,
uint32_t ColEnd) {
Block &B = Blocks.back();
assert(B.Lines.size() == B.Columns.size());
@ -101,7 +99,7 @@ void ModuleDebugLineFragment::addLineAndColumnInfo(uint32_t Offset,
B.Columns.push_back(CNE);
}
Error ModuleDebugLineFragment::commit(BinaryStreamWriter &Writer) {
Error DebugLinesSubsection::commit(BinaryStreamWriter &Writer) const {
LineFragmentHeader Header;
Header.CodeSize = CodeSize;
Header.Flags = hasColumnInfo() ? LF_HaveColumns : 0;
@ -135,7 +133,7 @@ Error ModuleDebugLineFragment::commit(BinaryStreamWriter &Writer) {
return Error::success();
}
uint32_t ModuleDebugLineFragment::calculateSerializedLength() {
uint32_t DebugLinesSubsection::calculateSerializedSize() const {
uint32_t Size = sizeof(LineFragmentHeader);
for (const auto &B : Blocks) {
Size += sizeof(LineBlockFragmentHeader);
@ -146,16 +144,16 @@ uint32_t ModuleDebugLineFragment::calculateSerializedLength() {
return Size;
}
void ModuleDebugLineFragment::setRelocationAddress(uint16_t Segment,
uint16_t Offset) {
void DebugLinesSubsection::setRelocationAddress(uint16_t Segment,
uint16_t Offset) {
RelocOffset = Offset;
RelocSegment = Segment;
}
void ModuleDebugLineFragment::setCodeSize(uint32_t Size) { CodeSize = Size; }
void DebugLinesSubsection::setCodeSize(uint32_t Size) { CodeSize = Size; }
void ModuleDebugLineFragment::setFlags(LineFlags Flags) { this->Flags = Flags; }
void DebugLinesSubsection::setFlags(LineFlags Flags) { this->Flags = Flags; }
bool ModuleDebugLineFragment::hasColumnInfo() const {
bool DebugLinesSubsection::hasColumnInfo() const {
return Flags & LF_HaveColumns;
}

@ -1,4 +1,4 @@
//===- StringTable.cpp - CodeView String Table Reader/Writer ----*- C++ -*-===//
//===- DebugStringTableSubsection.cpp - CodeView String Table ---*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,7 +7,7 @@
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/StringTable.h"
#include "llvm/DebugInfo/CodeView/DebugStringTableSubsection.h"
#include "llvm/Support/BinaryStream.h"
#include "llvm/Support/BinaryStreamReader.h"
@ -16,14 +16,16 @@
using namespace llvm;
using namespace llvm::codeview;
StringTableRef::StringTableRef() {}
DebugStringTableSubsectionRef::DebugStringTableSubsectionRef()
: DebugSubsectionRef(DebugSubsectionKind::StringTable) {}
Error StringTableRef::initialize(BinaryStreamRef Contents) {
Error DebugStringTableSubsectionRef::initialize(BinaryStreamRef Contents) {
Stream = Contents;
return Error::success();
}
Expected<StringRef> StringTableRef::getString(uint32_t Offset) const {
Expected<StringRef>
DebugStringTableSubsectionRef::getString(uint32_t Offset) const {
BinaryStreamReader Reader(Stream);
Reader.setOffset(Offset);
StringRef Result;
@ -32,7 +34,10 @@ Expected<StringRef> StringTableRef::getString(uint32_t Offset) const {
return Result;
}
uint32_t StringTable::insert(StringRef S) {
DebugStringTableSubsection::DebugStringTableSubsection()
: DebugSubsection(DebugSubsectionKind::StringTable) {}
uint32_t DebugStringTableSubsection::insert(StringRef S) {
auto P = Strings.insert({S, StringSize});
// If a given string didn't exist in the string table, we want to increment
@ -42,9 +47,11 @@ uint32_t StringTable::insert(StringRef S) {
return P.first->second;
}
uint32_t StringTable::calculateSerializedSize() const { return StringSize; }
uint32_t DebugStringTableSubsection::calculateSerializedSize() const {
return StringSize;
}
Error StringTable::commit(BinaryStreamWriter &Writer) const {
Error DebugStringTableSubsection::commit(BinaryStreamWriter &Writer) const {
assert(Writer.bytesRemaining() == StringSize);
uint32_t MaxOffset = 1;
@ -62,9 +69,9 @@ Error StringTable::commit(BinaryStreamWriter &Writer) const {
return Error::success();
}
uint32_t StringTable::size() const { return Strings.size(); }
uint32_t DebugStringTableSubsection::size() const { return Strings.size(); }
uint32_t StringTable::getStringId(StringRef S) const {
uint32_t DebugStringTableSubsection::getStringId(StringRef S) const {
auto P = Strings.find(S);
assert(P != Strings.end());
return P->second;

@ -1,4 +1,4 @@
//===- ModuleDebugFragment.cpp -----------------------------------*- C++-*-===//
//===- DebugSubsection.cpp -----------------------------------*- C++-*-===//
//
// The LLVM Compiler Infrastructure
//
@ -7,10 +7,10 @@
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/ModuleDebugFragment.h"
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
using namespace llvm::codeview;
ModuleDebugFragmentRef::~ModuleDebugFragmentRef() {}
DebugSubsectionRef::~DebugSubsectionRef() {}
ModuleDebugFragment::~ModuleDebugFragment() {}
DebugSubsection::~DebugSubsection() {}

@ -0,0 +1,81 @@
//===- DebugSubsectionRecord.cpp -----------------------------*- C++-*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/DebugSubsectionRecord.h"
#include "llvm/DebugInfo/CodeView/DebugSubsection.h"
#include "llvm/Support/BinaryStreamReader.h"
using namespace llvm;
using namespace llvm::codeview;
DebugSubsectionRecord::DebugSubsectionRecord()
: Kind(DebugSubsectionKind::None) {}
DebugSubsectionRecord::DebugSubsectionRecord(DebugSubsectionKind Kind,
BinaryStreamRef Data)
: Kind(Kind), Data(Data) {}
Error DebugSubsectionRecord::initialize(BinaryStreamRef Stream,
DebugSubsectionRecord &Info) {
const DebugSubsectionHeader *Header;
BinaryStreamReader Reader(Stream);
if (auto EC = Reader.readObject(Header))
return EC;
DebugSubsectionKind Kind =
static_cast<DebugSubsectionKind>(uint32_t(Header->Kind));
switch (Kind) {
case DebugSubsectionKind::FileChecksums:
case DebugSubsectionKind::Lines:
case DebugSubsectionKind::InlineeLines:
break;
default:
llvm_unreachable("Unexpected debug fragment kind!");
}
if (auto EC = Reader.readStreamRef(Info.Data, Header->Length))
return EC;
Info.Kind = Kind;
return Error::success();
}
uint32_t DebugSubsectionRecord::getRecordLength() const {
uint32_t Result = sizeof(DebugSubsectionHeader) + Data.getLength();
assert(Result % 4 == 0);
return Result;
}
DebugSubsectionKind DebugSubsectionRecord::kind() const { return Kind; }
BinaryStreamRef DebugSubsectionRecord::getRecordData() const { return Data; }
DebugSubsectionRecordBuilder::DebugSubsectionRecordBuilder(
DebugSubsectionKind Kind, DebugSubsection &Frag)
: Kind(Kind), Frag(Frag) {}
uint32_t DebugSubsectionRecordBuilder::calculateSerializedLength() {
uint32_t Size = sizeof(DebugSubsectionHeader) +
alignTo(Frag.calculateSerializedSize(), 4);
return Size;
}
Error DebugSubsectionRecordBuilder::commit(BinaryStreamWriter &Writer) {
DebugSubsectionHeader Header;
Header.Kind = uint32_t(Kind);
Header.Length = calculateSerializedLength() - sizeof(DebugSubsectionHeader);
if (auto EC = Writer.writeObject(Header))
return EC;
if (auto EC = Frag.commit(Writer))
return EC;
if (auto EC = Writer.padToAlignment(4))
return EC;
return Error::success();
}
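The length bookkeeping in `calculateSerializedLength()` and `commit()` above can be illustrated in isolation. This is a minimal standalone sketch, not LLVM code: `HeaderSize`, `alignUp`, `recordLength`, and `headerLengthField` are hypothetical names, and the 8-byte header (a 32-bit Kind plus a 32-bit Length) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for sizeof(DebugSubsectionHeader); assumed here to be
// two 32-bit fields (Kind and Length).
const uint32_t HeaderSize = 8;

// Round Value up to the next multiple of Align.
inline uint32_t alignUp(uint32_t Value, uint32_t Align) {
  assert(Align != 0 && "alignment must be non-zero");
  return (Value + Align - 1) / Align * Align;
}

// Total serialized size of one record: header plus payload padded to 4 bytes.
inline uint32_t recordLength(uint32_t PayloadSize) {
  return HeaderSize + alignUp(PayloadSize, 4);
}

// Value written to the header's Length field: the padded payload only,
// without the header itself.
inline uint32_t headerLengthField(uint32_t PayloadSize) {
  return recordLength(PayloadSize) - HeaderSize;
}
```

Under these assumptions a 5-byte payload occupies 16 bytes on disk, and the header records a length of 8, which is consistent with the `Result % 4 == 0` assertion in `getRecordLength()`.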

@ -0,0 +1,52 @@
//===- DebugSubsectionVisitor.cpp ---------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/DebugSubsectionVisitor.h"
#include "llvm/DebugInfo/CodeView/DebugChecksumsSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugInlineeLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugLinesSubsection.h"
#include "llvm/DebugInfo/CodeView/DebugSubsectionRecord.h"
#include "llvm/DebugInfo/CodeView/DebugUnknownSubsection.h"
#include "llvm/Support/BinaryStreamReader.h"
#include "llvm/Support/BinaryStreamRef.h"
using namespace llvm;
using namespace llvm::codeview;
Error llvm::codeview::visitDebugSubsection(const DebugSubsectionRecord &R,
DebugSubsectionVisitor &V) {
BinaryStreamReader Reader(R.getRecordData());
switch (R.kind()) {
case DebugSubsectionKind::Lines: {
DebugLinesSubsectionRef Fragment;
if (auto EC = Fragment.initialize(Reader))
return EC;
return V.visitLines(Fragment);
}
case DebugSubsectionKind::FileChecksums: {
DebugChecksumsSubsectionRef Fragment;
if (auto EC = Fragment.initialize(Reader))
return EC;
return V.visitFileChecksums(Fragment);
}
case DebugSubsectionKind::InlineeLines: {
DebugInlineeLinesSubsectionRef Fragment;
if (auto EC = Fragment.initialize(Reader))
return EC;
return V.visitInlineeLines(Fragment);
}
default: {
DebugUnknownSubsectionRef Fragment(R.kind(), R.getRecordData());
return V.visitUnknown(Fragment);
}
}
}
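The dispatch shape of `visitDebugSubsection` (one callback per known kind, with a catch-all for everything else) can be sketched without any LLVM types. The enum, the `Visitor` struct, and the string return values below are hypothetical and exist only to make the control flow observable.

```cpp
#include <cassert>
#include <string>

// Hypothetical subsection kinds; the real enum has many more entries.
enum class Kind { Lines, FileChecksums, InlineeLines, FrameData };

// Hypothetical visitor: one callback per known kind plus a catch-all.
struct Visitor {
  virtual ~Visitor() = default;
  virtual std::string visitLines() { return "lines"; }
  virtual std::string visitFileChecksums() { return "checksums"; }
  virtual std::string visitInlineeLines() { return "inlinee-lines"; }
  virtual std::string visitUnknown(Kind) { return "unknown"; }
};

// Dispatch on the kind. Kinds without a dedicated callback are routed to
// visitUnknown rather than treated as an error.
inline std::string visit(Kind K, Visitor &V) {
  switch (K) {
  case Kind::Lines:
    return V.visitLines();
  case Kind::FileChecksums:
    return V.visitFileChecksums();
  case Kind::InlineeLines:
    return V.visitInlineeLines();
  default:
    return V.visitUnknown(K);
  }
}
```

A client overrides only the callbacks it cares about; the `default` branch keeps unhandled kinds from aborting the walk.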

@ -0,0 +1,34 @@
//===- DebugSymbolsSubsection.cpp -------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/DebugSymbolsSubsection.h"
using namespace llvm;
using namespace llvm::codeview;
Error DebugSymbolsSubsectionRef::initialize(BinaryStreamReader Reader) {
return Reader.readArray(Records, Reader.getLength());
}
uint32_t DebugSymbolsSubsection::calculateSerializedSize() const {
return Length;
}
Error DebugSymbolsSubsection::commit(BinaryStreamWriter &Writer) const {
for (const auto &Record : Records) {
if (auto EC = Writer.writeBytes(Record.RecordData))
return EC;
}
return Error::success();
}
void DebugSymbolsSubsection::addSymbol(CVSymbol Symbol) {
Records.push_back(Symbol);
Length += Symbol.length();
}

@ -245,20 +245,20 @@ static const EnumEntry<uint32_t> FrameProcSymFlagNames[] = {
};
static const EnumEntry<uint32_t> ModuleSubstreamKindNames[] = {
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, None),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, Symbols),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, Lines),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, StringTable),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, FileChecksums),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, FrameData),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, InlineeLines),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, CrossScopeImports),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, CrossScopeExports),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, ILLines),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, FuncMDTokenMap),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, TypeMDTokenMap),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, MergedAssemblyInput),
CV_ENUM_CLASS_ENT(ModuleDebugFragmentKind, CoffSymbolRVA),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, None),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, Symbols),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, Lines),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, StringTable),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, FileChecksums),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, FrameData),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, InlineeLines),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, CrossScopeImports),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, CrossScopeExports),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, ILLines),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, FuncMDTokenMap),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, TypeMDTokenMap),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, MergedAssemblyInput),
CV_ENUM_CLASS_ENT(DebugSubsectionKind, CoffSymbolRVA),
};
static const EnumEntry<uint16_t> ExportSymFlagNames[] = {

@ -1,84 +0,0 @@
//===- ModuleDebugFragmentRecord.cpp -----------------------------*- C++-*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/ModuleDebugFragmentRecord.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragment.h"
#include "llvm/Support/BinaryStreamReader.h"
using namespace llvm;
using namespace llvm::codeview;
ModuleDebugFragmentRecord::ModuleDebugFragmentRecord()
: Kind(ModuleDebugFragmentKind::None) {}
ModuleDebugFragmentRecord::ModuleDebugFragmentRecord(
ModuleDebugFragmentKind Kind, BinaryStreamRef Data)
: Kind(Kind), Data(Data) {}
Error ModuleDebugFragmentRecord::initialize(BinaryStreamRef Stream,
ModuleDebugFragmentRecord &Info) {
const ModuleDebugFragmentHeader *Header;
BinaryStreamReader Reader(Stream);
if (auto EC = Reader.readObject(Header))
return EC;
ModuleDebugFragmentKind Kind =
static_cast<ModuleDebugFragmentKind>(uint32_t(Header->Kind));
switch (Kind) {
case ModuleDebugFragmentKind::FileChecksums:
case ModuleDebugFragmentKind::Lines:
case ModuleDebugFragmentKind::InlineeLines:
break;
default:
llvm_unreachable("Unexpected debug fragment kind!");
}
if (auto EC = Reader.readStreamRef(Info.Data, Header->Length))
return EC;
Info.Kind = Kind;
return Error::success();
}
uint32_t ModuleDebugFragmentRecord::getRecordLength() const {
uint32_t Result = sizeof(ModuleDebugFragmentHeader) + Data.getLength();
assert(Result % 4 == 0);
return Result;
}
ModuleDebugFragmentKind ModuleDebugFragmentRecord::kind() const { return Kind; }
BinaryStreamRef ModuleDebugFragmentRecord::getRecordData() const {
return Data;
}
ModuleDebugFragmentRecordBuilder::ModuleDebugFragmentRecordBuilder(
ModuleDebugFragmentKind Kind, ModuleDebugFragment &Frag)
: Kind(Kind), Frag(Frag) {}
uint32_t ModuleDebugFragmentRecordBuilder::calculateSerializedLength() {
uint32_t Size = sizeof(ModuleDebugFragmentHeader) +
alignTo(Frag.calculateSerializedLength(), 4);
return Size;
}
Error ModuleDebugFragmentRecordBuilder::commit(BinaryStreamWriter &Writer) {
ModuleDebugFragmentHeader Header;
Header.Kind = uint32_t(Kind);
Header.Length =
calculateSerializedLength() - sizeof(ModuleDebugFragmentHeader);
if (auto EC = Writer.writeObject(Header))
return EC;
if (auto EC = Frag.commit(Writer))
return EC;
if (auto EC = Writer.padToAlignment(4))
return EC;
return Error::success();
}

@ -1,52 +0,0 @@
//===- ModuleDebugFragmentVisitor.cpp ---------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
#include "llvm/DebugInfo/CodeView/ModuleDebugFragmentVisitor.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFileChecksumFragment.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragmentRecord.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugInlineeLinesFragment.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugLineFragment.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugUnknownFragment.h"
#include "llvm/Support/BinaryStreamReader.h"
#include "llvm/Support/BinaryStreamRef.h"
using namespace llvm;
using namespace llvm::codeview;
Error llvm::codeview::visitModuleDebugFragment(
const ModuleDebugFragmentRecord &R, ModuleDebugFragmentVisitor &V) {
BinaryStreamReader Reader(R.getRecordData());
switch (R.kind()) {
case ModuleDebugFragmentKind::Lines: {
ModuleDebugLineFragmentRef Fragment;
if (auto EC = Fragment.initialize(Reader))
return EC;
return V.visitLines(Fragment);
}
case ModuleDebugFragmentKind::FileChecksums: {
ModuleDebugFileChecksumFragmentRef Fragment;
if (auto EC = Fragment.initialize(Reader))
return EC;
return V.visitFileChecksums(Fragment);
}
case ModuleDebugFragmentKind::InlineeLines: {
ModuleDebugInlineeLineFragmentRef Fragment;
if (auto EC = Fragment.initialize(Reader))
return EC;
return V.visitInlineeLines(Fragment);
}
default: {
ModuleDebugUnknownFragmentRef Fragment(R.kind(), R.getRecordData());
return V.visitUnknown(Fragment);
}
}
}

@ -11,8 +11,8 @@
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/DebugInfo/CodeView/CVSymbolVisitor.h"
#include "llvm/DebugInfo/CodeView/DebugStringTableSubsection.h"
#include "llvm/DebugInfo/CodeView/EnumTables.h"
#include "llvm/DebugInfo/CodeView/StringTable.h"
#include "llvm/DebugInfo/CodeView/SymbolDeserializer.h"
#include "llvm/DebugInfo/CodeView/SymbolDumpDelegate.h"
#include "llvm/DebugInfo/CodeView/SymbolRecord.h"
@ -369,7 +369,7 @@ Error CVSymbolDumperImpl::visitKnownRecord(
DictScope S(W, "DefRangeSubfield");
if (ObjDelegate) {
StringTableRef Strings = ObjDelegate->getStringTable();
DebugStringTableSubsectionRef Strings = ObjDelegate->getStringTable();
auto ExpectedProgram = Strings.getString(DefRangeSubfield.Program);
if (!ExpectedProgram) {
consumeError(ExpectedProgram.takeError());
@ -390,7 +390,7 @@ Error CVSymbolDumperImpl::visitKnownRecord(CVSymbol &CVR,
DictScope S(W, "DefRange");
if (ObjDelegate) {
StringTableRef Strings = ObjDelegate->getStringTable();
DebugStringTableSubsectionRef Strings = ObjDelegate->getStringTable();
auto ExpectedProgram = Strings.getString(DefRange.Program);
if (!ExpectedProgram) {
consumeError(ExpectedProgram.takeError());

@ -10,7 +10,7 @@
#include "llvm/DebugInfo/PDB/Native/DbiModuleDescriptorBuilder.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/DebugInfo/CodeView/ModuleDebugFragmentRecord.h"
#include "llvm/DebugInfo/CodeView/DebugSubsectionRecord.h"
#include "llvm/DebugInfo/MSF/MSFBuilder.h"
#include "llvm/DebugInfo/MSF/MSFCommon.h"
#include "llvm/DebugInfo/MSF/MappedBlockStream.h"
@ -170,8 +170,8 @@ Error DbiModuleDescriptorBuilder::commit(BinaryStreamWriter &ModiWriter,
}
void DbiModuleDescriptorBuilder::addC13Fragment(
std::unique_ptr<ModuleDebugLineFragment> Lines) {
ModuleDebugLineFragment &Frag = *Lines;
std::unique_ptr<DebugLinesSubsection> Lines) {
DebugLinesSubsection &Frag = *Lines;
// File Checksums have to come first, so push an empty entry on if this
// is the first.
@ -180,12 +180,12 @@ void DbiModuleDescriptorBuilder::addC13Fragment(
this->LineInfo.push_back(std::move(Lines));
C13Builders.push_back(
llvm::make_unique<ModuleDebugFragmentRecordBuilder>(Frag.kind(), Frag));
llvm::make_unique<DebugSubsectionRecordBuilder>(Frag.kind(), Frag));
}
void DbiModuleDescriptorBuilder::addC13Fragment(
std::unique_ptr<codeview::ModuleDebugInlineeLineFragment> Inlinees) {
ModuleDebugInlineeLineFragment &Frag = *Inlinees;
std::unique_ptr<codeview::DebugInlineeLinesSubsection> Inlinees) {
DebugInlineeLinesSubsection &Frag = *Inlinees;
// File Checksums have to come first, so push an empty entry on if this
// is the first.
@ -194,17 +194,17 @@ void DbiModuleDescriptorBuilder::addC13Fragment(
this->Inlinees.push_back(std::move(Inlinees));
C13Builders.push_back(
llvm::make_unique<ModuleDebugFragmentRecordBuilder>(Frag.kind(), Frag));
llvm::make_unique<DebugSubsectionRecordBuilder>(Frag.kind(), Frag));
}
void DbiModuleDescriptorBuilder::setC13FileChecksums(
std::unique_ptr<ModuleDebugFileChecksumFragment> Checksums) {
std::unique_ptr<DebugChecksumsSubsection> Checksums) {
assert(!ChecksumInfo && "Can't have more than one checksum info!");
if (C13Builders.empty())
C13Builders.push_back(nullptr);
ChecksumInfo = std::move(Checksums);
C13Builders[0] = llvm::make_unique<ModuleDebugFragmentRecordBuilder>(
C13Builders[0] = llvm::make_unique<DebugSubsectionRecordBuilder>(
ChecksumInfo->kind(), *ChecksumInfo);
}

@ -145,7 +145,7 @@ void CodeViewContext::emitStringTable(MCObjectStreamer &OS) {
MCSymbol *StringBegin = Ctx.createTempSymbol("strtab_begin", false),
*StringEnd = Ctx.createTempSymbol("strtab_end", false);
OS.EmitIntValue(unsigned(ModuleDebugFragmentKind::StringTable), 4);
OS.EmitIntValue(unsigned(DebugSubsectionKind::StringTable), 4);
OS.emitAbsoluteSymbolDiff(StringEnd, StringBegin, 4);
OS.EmitLabel(StringBegin);
@ -172,7 +172,7 @@ void CodeViewContext::emitFileChecksums(MCObjectStreamer &OS) {
MCSymbol *FileBegin = Ctx.createTempSymbol("filechecksums_begin", false),
*FileEnd = Ctx.createTempSymbol("filechecksums_end", false);
OS.EmitIntValue(unsigned(ModuleDebugFragmentKind::FileChecksums), 4);
OS.EmitIntValue(unsigned(DebugSubsectionKind::FileChecksums), 4);
OS.emitAbsoluteSymbolDiff(FileEnd, FileBegin, 4);
OS.EmitLabel(FileBegin);
@ -197,7 +197,7 @@ void CodeViewContext::emitLineTableForFunction(MCObjectStreamer &OS,
MCSymbol *LineBegin = Ctx.createTempSymbol("linetable_begin", false),
*LineEnd = Ctx.createTempSymbol("linetable_end", false);
OS.EmitIntValue(unsigned(ModuleDebugFragmentKind::Lines), 4);
OS.EmitIntValue(unsigned(DebugSubsectionKind::Lines), 4);
OS.emitAbsoluteSymbolDiff(LineEnd, LineBegin, 4);
OS.EmitLabel(LineBegin);
OS.EmitCOFFSecRel32(FuncBegin, /*Offset=*/0);

@ -1559,11 +1559,13 @@ IEEEFloat::opStatus IEEEFloat::divideSpecials(const IEEEFloat &rhs) {
case PackCategoriesIntoKey(fcInfinity, fcNaN):
category = fcNaN;
copySignificand(rhs);
LLVM_FALLTHROUGH;
case PackCategoriesIntoKey(fcNaN, fcZero):
case PackCategoriesIntoKey(fcNaN, fcNormal):
case PackCategoriesIntoKey(fcNaN, fcInfinity):
case PackCategoriesIntoKey(fcNaN, fcNaN):
sign = false;
LLVM_FALLTHROUGH;
case PackCategoriesIntoKey(fcInfinity, fcZero):
case PackCategoriesIntoKey(fcInfinity, fcNormal):
case PackCategoriesIntoKey(fcZero, fcInfinity):

@ -72,10 +72,15 @@ std::unique_ptr<raw_fd_ostream> llvm::CreateInfoOutputFile() {
return llvm::make_unique<raw_fd_ostream>(2, false); // stderr.
}
static TimerGroup *getDefaultTimerGroup() {
static TimerGroup DefaultTimerGroup("misc", "Miscellaneous Ungrouped Timers");
return &DefaultTimerGroup;
}
namespace {
struct CreateDefaultTimerGroup {
static void *call() {
return new TimerGroup("misc", "Miscellaneous Ungrouped Timers");
}
};
} // namespace
static ManagedStatic<TimerGroup, CreateDefaultTimerGroup> DefaultTimerGroup;
static TimerGroup *getDefaultTimerGroup() { return &*DefaultTimerGroup; }
//===----------------------------------------------------------------------===//
// Timer Implementation
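The hunk above replaces a function-local static with a `ManagedStatic` that is built on first use through a creator struct. The creator-based lazy construction can be sketched as follows. `LazyStatic`, `Group`, and `CreateDefaultGroup` are hypothetical names; unlike the real facility, this sketch is not thread-safe and has no coordinated shutdown.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical payload type standing in for TimerGroup.
struct Group {
  std::string Name;
};

// Creator policy: knows how to build the default object on demand.
struct CreateDefaultGroup {
  static Group *call() { return new Group{"misc"}; }
};

// Minimal lazy holder: nothing is constructed until the first dereference.
// Not thread-safe; an illustration only.
template <typename T, typename Creator> class LazyStatic {
  mutable std::unique_ptr<T> Ptr;

public:
  bool isConstructed() const { return Ptr != nullptr; }
  T &operator*() const {
    if (!Ptr)
      Ptr.reset(Creator::call());
    return *Ptr;
  }
};

static LazyStatic<Group, CreateDefaultGroup> DefaultGroup;

// Mirrors the accessor in the hunk: hand out a pointer to the lazily built
// object.
static Group *getDefaultGroup() { return &*DefaultGroup; }
```

Routing construction through a creator type lets the default object take constructor arguments while creation is still deferred until first use.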

@ -405,27 +405,21 @@ IntInit::convertInitializerBitRange(ArrayRef<unsigned> Bits) const {
}
CodeInit *CodeInit::get(StringRef V) {
static DenseMap<StringRef, CodeInit*> ThePool;
static StringMap<CodeInit*, BumpPtrAllocator &> ThePool(Allocator);
auto I = ThePool.insert(std::make_pair(V, nullptr));
if (I.second) {
StringRef VCopy = V.copy(Allocator);
I.first->first = VCopy;
I.first->second = new(Allocator) CodeInit(VCopy);
}
return I.first->second;
auto &Entry = *ThePool.insert(std::make_pair(V, nullptr)).first;
if (!Entry.second)
Entry.second = new(Allocator) CodeInit(Entry.getKey());
return Entry.second;
}
StringInit *StringInit::get(StringRef V) {
static DenseMap<StringRef, StringInit*> ThePool;
static StringMap<StringInit*, BumpPtrAllocator &> ThePool(Allocator);
auto I = ThePool.insert(std::make_pair(V, nullptr));
if (I.second) {
StringRef VCopy = V.copy(Allocator);
I.first->first = VCopy;
I.first->second = new(Allocator) StringInit(VCopy);
}
return I.first->second;
auto &Entry = *ThePool.insert(std::make_pair(V, nullptr)).first;
if (!Entry.second)
Entry.second = new(Allocator) StringInit(Entry.getKey());
return Entry.second;
}
Init *StringInit::convertInitializerTo(RecTy *Ty) const {
@ -1540,7 +1534,7 @@ Init *DagInit::resolveReferences(Record &R, const RecordVal *RV) const {
SmallVector<Init*, 8> NewArgs;
NewArgs.reserve(arg_size());
bool ArgsChanged = false;
for (const Init *Arg : args()) {
for (const Init *Arg : getArgs()) {
Init *NewArg = Arg->resolveReferences(R, RV);
NewArgs.push_back(NewArg);
ArgsChanged |= NewArg != Arg;
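The `CodeInit::get` and `StringInit::get` rewrite above moves to a map that owns its key storage, so the interned object can refer to the key without a separate copy. A rough standard-library analogue, assuming `std::unordered_map` in place of `StringMap` and with hypothetical `Interned` and `Pool` types:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical interned object; it refers to the key stored in the pool
// instead of holding its own copy of the string.
struct Interned {
  explicit Interned(const std::string &V) : Value(V) {}
  const std::string &Value;
};

class Pool {
  // The map owns both the key string and the interned object. References to
  // unordered_map elements stay valid across rehashing.
  std::unordered_map<std::string, std::unique_ptr<Interned>> Map;

public:
  const Interned *get(const std::string &V) {
    // Insert a placeholder; if the key already existed this is a lookup.
    auto &Entry = *Map.emplace(V, nullptr).first;
    if (!Entry.second)
      Entry.second.reset(new Interned(Entry.first));
    return Entry.second.get();
  }
};
```

Equal strings yield the same pointer, and the single `emplace` call serves as both the lookup and the insertion, matching the shape of the new code.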

@ -137,6 +137,34 @@ static cl::opt<bool> EnableRedZone("aarch64-redzone",
STATISTIC(NumRedZoneFunctions, "Number of functions using red zone");
/// Look at each instruction that references stack frames and return the stack
/// size limit beyond which some of these instructions will require a scratch
/// register during their expansion later.
static unsigned estimateRSStackSizeLimit(MachineFunction &MF) {
// FIXME: For now, just conservatively guesstimate based on unscaled indexing
// range. We'll end up allocating an unnecessary spill slot a lot, but
// realistically that's not a big deal at this stage of the game.
for (MachineBasicBlock &MBB : MF) {
for (MachineInstr &MI : MBB) {
if (MI.isDebugValue() || MI.isPseudo() ||
MI.getOpcode() == AArch64::ADDXri ||
MI.getOpcode() == AArch64::ADDSXri)
continue;
for (unsigned i = 0, e = MI.getNumOperands(); i != e; ++i) {
if (!MI.getOperand(i).isFI())
continue;
int Offset = 0;
if (isAArch64FrameOffsetLegal(MI, Offset, nullptr, nullptr, nullptr) ==
AArch64FrameOffsetCannotUpdate)
return 0;
}
}
}
return 255;
}
bool AArch64FrameLowering::canUseRedZone(const MachineFunction &MF) const {
if (!EnableRedZone)
return false;
@ -1169,16 +1197,13 @@ void AArch64FrameLowering::determineCalleeSaves(MachineFunction &MF,
unsigned NumRegsSpilled = SavedRegs.count();
bool CanEliminateFrame = NumRegsSpilled == 0;
// FIXME: Set BigStack if any stack slot references may be out of range.
// For now, just conservatively guesstimate based on unscaled indexing
// range. We'll end up allocating an unnecessary spill slot a lot, but
// realistically that's not a big deal at this stage of the game.
// The CSR spill slots have not been allocated yet, so estimateStackSize
// won't include them.
MachineFrameInfo &MFI = MF.getFrameInfo();
unsigned CFSize = MFI.estimateStackSize(MF) + 8 * NumRegsSpilled;
DEBUG(dbgs() << "Estimated stack frame size: " << CFSize << " bytes.\n");
bool BigStack = (CFSize >= 256);
unsigned EstimatedStackSizeLimit = estimateRSStackSizeLimit(MF);
bool BigStack = (CFSize > EstimatedStackSizeLimit);
if (BigStack || !CanEliminateFrame || RegInfo->cannotEliminateFrame(MF))
AFI->setHasStackFrame(true);
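The new `BigStack` decision can be modelled in a few lines. In this sketch `FrameRef`, `estimateLimit`, and `isBigStack` are hypothetical simplifications: each frame reference is reduced to a single flag saying whether its offset could be updated, which stands in for the `isAArch64FrameOffsetLegal` query.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of one frame-index operand.
struct FrameRef {
  bool OffsetCanBeUpdated;
};

// Returns the stack size beyond which a scratch register may be needed:
// 0 as soon as any reference cannot have its offset updated, otherwise the
// conservative unscaled-indexing bound of 255.
inline unsigned estimateLimit(const std::vector<FrameRef> &Refs) {
  for (const FrameRef &R : Refs)
    if (!R.OffsetCanBeUpdated)
      return 0;
  return 255;
}

// Estimated frame size is the locals plus 8 bytes per spilled register; the
// stack counts as "big" when that estimate exceeds the limit.
inline bool isBigStack(unsigned LocalsSize, unsigned NumRegsSpilled,
                       const std::vector<FrameRef> &Refs) {
  unsigned CFSize = LocalsSize + 8 * NumRegsSpilled;
  return CFSize > estimateLimit(Refs);
}
```

When every reference is updatable, `CFSize > 255` is the same test as the old `CFSize >= 256`; the behaviour only changes when the limit drops to 0, in which case any non-empty frame is treated as big.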

@ -381,7 +381,6 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FNEARBYINT, MVT::v4f16, Expand);
setOperationAction(ISD::FNEG, MVT::v4f16, Expand);
setOperationAction(ISD::FPOW, MVT::v4f16, Expand);
setOperationAction(ISD::FPOWI, MVT::v4f16, Expand);
setOperationAction(ISD::FREM, MVT::v4f16, Expand);
setOperationAction(ISD::FROUND, MVT::v4f16, Expand);
setOperationAction(ISD::FRINT, MVT::v4f16, Expand);
@ -413,7 +412,6 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FNEARBYINT, MVT::v8f16, Expand);
setOperationAction(ISD::FNEG, MVT::v8f16, Expand);
setOperationAction(ISD::FPOW, MVT::v8f16, Expand);
setOperationAction(ISD::FPOWI, MVT::v8f16, Expand);
setOperationAction(ISD::FREM, MVT::v8f16, Expand);
setOperationAction(ISD::FROUND, MVT::v8f16, Expand);
setOperationAction(ISD::FRINT, MVT::v8f16, Expand);
@ -726,7 +724,6 @@ void AArch64TargetLowering::addTypeForNEON(MVT VT, MVT PromotedBitwiseVT) {
if (VT == MVT::v2f32 || VT == MVT::v4f32 || VT == MVT::v2f64) {
setOperationAction(ISD::FSIN, VT, Expand);
setOperationAction(ISD::FCOS, VT, Expand);
setOperationAction(ISD::FPOWI, VT, Expand);
setOperationAction(ISD::FPOW, VT, Expand);
setOperationAction(ISD::FLOG, VT, Expand);
setOperationAction(ISD::FLOG2, VT, Expand);

@ -730,7 +730,7 @@ public:
/// \returns True if waitcnt instruction is needed before barrier instruction,
/// false otherwise.
bool needWaitcntBeforeBarrier() const {
return getGeneration() < GFX9;
return true;
}
/// \returns true if the flat_scratch register should be initialized with the

@ -736,6 +736,9 @@ void GCNPassConfig::addMachineSSAOptimization() {
addPass(createSIShrinkInstructionsPass());
if (EnableSDWAPeephole) {
addPass(&SIPeepholeSDWAID);
addPass(&MachineLICMID);
addPass(&MachineCSEID);
addPass(&SIFoldOperandsID);
addPass(&DeadMachineInstructionElimID);
}
}

@ -247,9 +247,10 @@ static bool tryAddToFoldList(SmallVectorImpl<FoldCandidate> &FoldList,
// If the use operand doesn't care about the value, this may be an operand only
// used for register indexing, in which case it is unsafe to fold.
static bool isUseSafeToFold(const MachineInstr &MI,
static bool isUseSafeToFold(const SIInstrInfo *TII,
const MachineInstr &MI,
const MachineOperand &UseMO) {
return !UseMO.isUndef();
return !UseMO.isUndef() && !TII->isSDWA(MI);
//return !MI.hasRegisterImplicitUseOperand(UseMO.getReg());
}
@ -261,7 +262,7 @@ void SIFoldOperands::foldOperand(
SmallVectorImpl<MachineInstr *> &CopiesToReplace) const {
const MachineOperand &UseOp = UseMI->getOperand(UseOpIdx);
if (!isUseSafeToFold(*UseMI, UseOp))
if (!isUseSafeToFold(TII, *UseMI, UseOp))
return;
// FIXME: Fold operands with subregs.

@ -55,6 +55,7 @@ private:
std::unordered_map<MachineInstr *, std::unique_ptr<SDWAOperand>> SDWAOperands;
std::unordered_map<MachineInstr *, SDWAOperandsVector> PotentialMatches;
SmallVector<MachineInstr *, 8> ConvertedInstructions;
Optional<int64_t> foldToImm(const MachineOperand &Op) const;
@ -69,6 +70,7 @@ public:
void matchSDWAOperands(MachineFunction &MF);
bool isConvertibleToSDWA(const MachineInstr &MI) const;
bool convertToSDWA(MachineInstr &MI, const SDWAOperandsVector &SDWAOperands);
void legalizeScalarOperands(MachineInstr &MI) const;
StringRef getPassName() const override { return "SI Peephole SDWA"; }
@ -289,7 +291,7 @@ bool SDWASrcOperand::convertToSDWA(MachineInstr &MI, const SIInstrInfo *TII) {
MachineOperand *SrcSel = TII->getNamedOperand(MI, AMDGPU::OpName::src0_sel);
MachineOperand *SrcMods =
TII->getNamedOperand(MI, AMDGPU::OpName::src0_modifiers);
assert(Src && Src->isReg());
assert(Src && (Src->isReg() || Src->isImm()));
if (!isSameReg(*Src, *getReplacedOperand())) {
// If this is not src0 then it should be src1
Src = TII->getNamedOperand(MI, AMDGPU::OpName::src1);
@ -580,18 +582,8 @@ void SIPeepholeSDWA::matchSDWAOperands(MachineFunction &MF) {
}
bool SIPeepholeSDWA::isConvertibleToSDWA(const MachineInstr &MI) const {
// Check if this instruction can be converted to SDWA:
// 1. Does this opcode support SDWA
if (AMDGPU::getSDWAOp(MI.getOpcode()) == -1)
return false;
// 2. Are all operands - VGPRs
for (const MachineOperand &Operand : MI.explicit_operands()) {
if (!Operand.isReg() || !TRI->isVGPR(*MRI, Operand.getReg()))
return false;
}
return true;
// Check if this instruction has an opcode that supports SDWA
return AMDGPU::getSDWAOp(MI.getOpcode()) != -1;
}
bool SIPeepholeSDWA::convertToSDWA(MachineInstr &MI,
@ -685,7 +677,9 @@ bool SIPeepholeSDWA::convertToSDWA(MachineInstr &MI,
if (PotentialMatches.count(Operand->getParentInst()) == 0)
Converted |= Operand->convertToSDWA(*SDWAInst, TII);
}
if (!Converted) {
if (Converted) {
ConvertedInstructions.push_back(SDWAInst);
} else {
SDWAInst->eraseFromParent();
return false;
}
@ -698,6 +692,29 @@ bool SIPeepholeSDWA::convertToSDWA(MachineInstr &MI,
return true;
}
// If an instruction was converted to SDWA it should not have immediates or SGPR
// operands. Copy its scalar operands into VGPRs.
void SIPeepholeSDWA::legalizeScalarOperands(MachineInstr &MI) const {
const MCInstrDesc &Desc = TII->get(MI.getOpcode());
for (unsigned I = 0, E = MI.getNumExplicitOperands(); I != E; ++I) {
MachineOperand &Op = MI.getOperand(I);
if (!Op.isImm() && !(Op.isReg() && !TRI->isVGPR(*MRI, Op.getReg())))
continue;
if (Desc.OpInfo[I].RegClass == -1 ||
!TRI->hasVGPRs(TRI->getRegClass(Desc.OpInfo[I].RegClass)))
continue;
unsigned VGPR = MRI->createVirtualRegister(&AMDGPU::VGPR_32RegClass);
auto Copy = BuildMI(*MI.getParent(), MI.getIterator(), MI.getDebugLoc(),
TII->get(AMDGPU::V_MOV_B32_e32), VGPR);
if (Op.isImm())
Copy.addImm(Op.getImm());
else if (Op.isReg())
Copy.addReg(Op.getReg(), Op.isKill() ? RegState::Kill : 0,
Op.getSubReg());
Op.ChangeToRegister(VGPR, false);
}
}
bool SIPeepholeSDWA::runOnMachineFunction(MachineFunction &MF) {
const SISubtarget &ST = MF.getSubtarget<SISubtarget>();
@ -728,5 +745,9 @@ bool SIPeepholeSDWA::runOnMachineFunction(MachineFunction &MF) {
PotentialMatches.clear();
SDWAOperands.clear();
while (!ConvertedInstructions.empty())
legalizeScalarOperands(*ConvertedInstructions.pop_back_val());
return false;
}

@ -585,7 +585,6 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FSQRT, MVT::v2f64, Expand);
setOperationAction(ISD::FSIN, MVT::v2f64, Expand);
setOperationAction(ISD::FCOS, MVT::v2f64, Expand);
setOperationAction(ISD::FPOWI, MVT::v2f64, Expand);
setOperationAction(ISD::FPOW, MVT::v2f64, Expand);
setOperationAction(ISD::FLOG, MVT::v2f64, Expand);
setOperationAction(ISD::FLOG2, MVT::v2f64, Expand);
@ -603,7 +602,6 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FSQRT, MVT::v4f32, Expand);
setOperationAction(ISD::FSIN, MVT::v4f32, Expand);
setOperationAction(ISD::FCOS, MVT::v4f32, Expand);
setOperationAction(ISD::FPOWI, MVT::v4f32, Expand);
setOperationAction(ISD::FPOW, MVT::v4f32, Expand);
setOperationAction(ISD::FLOG, MVT::v4f32, Expand);
setOperationAction(ISD::FLOG2, MVT::v4f32, Expand);
@ -620,7 +618,6 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FSQRT, MVT::v2f32, Expand);
setOperationAction(ISD::FSIN, MVT::v2f32, Expand);
setOperationAction(ISD::FCOS, MVT::v2f32, Expand);
setOperationAction(ISD::FPOWI, MVT::v2f32, Expand);
setOperationAction(ISD::FPOW, MVT::v2f32, Expand);
setOperationAction(ISD::FLOG, MVT::v2f32, Expand);
setOperationAction(ISD::FLOG2, MVT::v2f32, Expand);
@ -743,7 +740,6 @@ ARMTargetLowering::ARMTargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FSQRT, MVT::f64, Expand);
setOperationAction(ISD::FSIN, MVT::f64, Expand);
setOperationAction(ISD::FCOS, MVT::f64, Expand);
setOperationAction(ISD::FPOWI, MVT::f64, Expand);
setOperationAction(ISD::FPOW, MVT::f64, Expand);
setOperationAction(ISD::FLOG, MVT::f64, Expand);
setOperationAction(ISD::FLOG2, MVT::f64, Expand);

@ -2003,7 +2003,7 @@ HexagonTargetLowering::HexagonTargetLowering(const TargetMachine &TM,
// Floating point arithmetic/math functions:
ISD::FADD, ISD::FSUB, ISD::FMUL, ISD::FMA, ISD::FDIV,
ISD::FREM, ISD::FNEG, ISD::FABS, ISD::FSQRT, ISD::FSIN,
ISD::FCOS, ISD::FPOWI, ISD::FPOW, ISD::FLOG, ISD::FLOG2,
ISD::FCOS, ISD::FPOW, ISD::FLOG, ISD::FLOG2,
ISD::FLOG10, ISD::FEXP, ISD::FEXP2, ISD::FCEIL, ISD::FTRUNC,
ISD::FRINT, ISD::FNEARBYINT, ISD::FROUND, ISD::FFLOOR,
ISD::FMINNUM, ISD::FMAXNUM, ISD::FSINCOS,

@ -13,6 +13,7 @@
#include "MCTargetDesc/MipsMCTargetDesc.h"
#include "MipsTargetStreamer.h"
#include "MCTargetDesc/MipsBaseInfo.h"
#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringSwitch.h"
@ -216,9 +217,15 @@ class MipsAsmParser : public MCTargetAsmParser {
unsigned SrcReg, bool Is32BitSym, SMLoc IDLoc,
MCStreamer &Out, const MCSubtargetInfo *STI);
bool emitPartialAddress(MipsTargetStreamer &TOut, SMLoc IDLoc, MCSymbol *Sym);
bool expandLoadImm(MCInst &Inst, bool Is32BitImm, SMLoc IDLoc,
MCStreamer &Out, const MCSubtargetInfo *STI);
bool expandLoadImmReal(MCInst &Inst, bool IsSingle, bool IsGPR, bool Is64FPU,
SMLoc IDLoc, MCStreamer &Out,
const MCSubtargetInfo *STI);
bool expandLoadAddress(unsigned DstReg, unsigned BaseReg,
const MCOperand &Offset, bool Is32BitAddress,
SMLoc IDLoc, MCStreamer &Out,
@ -1011,6 +1018,16 @@ public:
Inst.addOperand(MCOperand::createReg(getAFGR64Reg()));
}
void addStrictlyAFGR64AsmRegOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createReg(getAFGR64Reg()));
}
void addStrictlyFGR64AsmRegOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createReg(getFGR64Reg()));
}
void addFGR64AsmRegOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createReg(getFGR64Reg()));
@ -1027,6 +1044,15 @@ public:
"registers");
}
void addStrictlyFGR32AsmRegOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createReg(getFGR32Reg()));
// FIXME: We ought to do this for -integrated-as without -via-file-asm too.
if (!AsmParser.useOddSPReg() && RegIdx.Index & 1)
AsmParser.Error(StartLoc, "-mno-odd-spreg prohibits the use of odd FPU "
"registers");
}
void addFGRH32AsmRegOperands(MCInst &Inst, unsigned N) const {
assert(N == 1 && "Invalid number of operands!");
Inst.addOperand(MCOperand::createReg(getFGRH32Reg()));
@ -1574,6 +1600,11 @@ public:
return isRegIdx() && RegIdx.Kind & RegKind_FGR && RegIdx.Index <= 31;
}
bool isStrictlyFGRAsmReg() const {
// AFGR64 is $0-$15 but we handle this in getAFGR64Reg()
return isRegIdx() && RegIdx.Kind == RegKind_FGR && RegIdx.Index <= 31;
}
bool isHWRegsAsmReg() const {
return isRegIdx() && RegIdx.Kind & RegKind_HWRegs && RegIdx.Index <= 31;
}
@ -2368,6 +2399,27 @@ MipsAsmParser::tryExpandInstruction(MCInst &Inst, SMLoc IDLoc, MCStreamer &Out,
case Mips::PseudoTRUNC_W_D:
return expandTrunc(Inst, true, true, IDLoc, Out, STI) ? MER_Fail
: MER_Success;
case Mips::LoadImmSingleGPR:
return expandLoadImmReal(Inst, true, true, false, IDLoc, Out, STI)
? MER_Fail
: MER_Success;
case Mips::LoadImmSingleFGR:
return expandLoadImmReal(Inst, true, false, false, IDLoc, Out, STI)
? MER_Fail
: MER_Success;
case Mips::LoadImmDoubleGPR:
return expandLoadImmReal(Inst, false, true, false, IDLoc, Out, STI)
? MER_Fail
: MER_Success;
case Mips::LoadImmDoubleFGR:
return expandLoadImmReal(Inst, false, false, true, IDLoc, Out, STI)
? MER_Fail
: MER_Success;
case Mips::LoadImmDoubleFGR_32:
return expandLoadImmReal(Inst, false, false, false, IDLoc, Out, STI)
? MER_Fail
: MER_Success;
case Mips::Ulh:
return expandUlh(Inst, true, IDLoc, Out, STI) ? MER_Fail : MER_Success;
case Mips::Ulhu:
@ -2952,6 +3004,302 @@ bool MipsAsmParser::loadAndAddSymbolAddress(const MCExpr *SymExpr,
return false;
}
// Each double-precision register D0-D15 overlaps with two of the single
// precision registers F0-F31. As an example, all of the following hold true:
// D0 + 1 == F1, F1 + 1 == D1, F1 + 1 == F2, depending on the context.
static unsigned nextReg(unsigned Reg) {
if (MipsMCRegisterClasses[Mips::FGR32RegClassID].contains(Reg))
return Reg == (unsigned)Mips::F31 ? (unsigned)Mips::F0 : Reg + 1;
switch (Reg) {
default: llvm_unreachable("Unknown register in assembly macro expansion!");
case Mips::ZERO: return Mips::AT;
case Mips::AT: return Mips::V0;
case Mips::V0: return Mips::V1;
case Mips::V1: return Mips::A0;
case Mips::A0: return Mips::A1;
case Mips::A1: return Mips::A2;
case Mips::A2: return Mips::A3;
case Mips::A3: return Mips::T0;
case Mips::T0: return Mips::T1;
case Mips::T1: return Mips::T2;
case Mips::T2: return Mips::T3;
case Mips::T3: return Mips::T4;
case Mips::T4: return Mips::T5;
case Mips::T5: return Mips::T6;
case Mips::T6: return Mips::T7;
case Mips::T7: return Mips::S0;
case Mips::S0: return Mips::S1;
case Mips::S1: return Mips::S2;
case Mips::S2: return Mips::S3;
case Mips::S3: return Mips::S4;
case Mips::S4: return Mips::S5;
case Mips::S5: return Mips::S6;
case Mips::S6: return Mips::S7;
case Mips::S7: return Mips::T8;
case Mips::T8: return Mips::T9;
case Mips::T9: return Mips::K0;
case Mips::K0: return Mips::K1;
case Mips::K1: return Mips::GP;
case Mips::GP: return Mips::SP;
case Mips::SP: return Mips::FP;
case Mips::FP: return Mips::RA;
case Mips::RA: return Mips::ZERO;
case Mips::D0: return Mips::F1;
case Mips::D1: return Mips::F3;
case Mips::D2: return Mips::F5;
case Mips::D3: return Mips::F7;
case Mips::D4: return Mips::F9;
case Mips::D5: return Mips::F11;
case Mips::D6: return Mips::F13;
case Mips::D7: return Mips::F15;
case Mips::D8: return Mips::F17;
case Mips::D9: return Mips::F19;
case Mips::D10: return Mips::F21;
case Mips::D11: return Mips::F23;
case Mips::D12: return Mips::F25;
case Mips::D13: return Mips::F27;
case Mips::D14: return Mips::F29;
case Mips::D15: return Mips::F31;
}
}
// FIXME: This method is too general. In principle we should compute the number
// of instructions required to synthesize the immediate inline compared to
// synthesizing the address inline and relying on non .text sections.
// For static O32 and N32 this may yield a small benefit, for static N64 this is
// likely to yield a much larger benefit as we have to synthesize a 64-bit
// address to load a 64-bit value.
bool MipsAsmParser::emitPartialAddress(MipsTargetStreamer &TOut, SMLoc IDLoc,
MCSymbol *Sym) {
unsigned ATReg = getATReg(IDLoc);
if (!ATReg)
return true;
if(IsPicEnabled) {
const MCExpr *GotSym =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext());
const MipsMCExpr *GotExpr =
MipsMCExpr::create(MipsMCExpr::MEK_GOT, GotSym, getContext());
if(isABI_O32() || isABI_N32()) {
TOut.emitRRX(Mips::LW, ATReg, Mips::GP, MCOperand::createExpr(GotExpr),
IDLoc, STI);
} else { //isABI_N64()
TOut.emitRRX(Mips::LD, ATReg, Mips::GP, MCOperand::createExpr(GotExpr),
IDLoc, STI);
}
} else { //!IsPicEnabled
const MCExpr *HiSym =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext());
const MipsMCExpr *HiExpr =
MipsMCExpr::create(MipsMCExpr::MEK_HI, HiSym, getContext());
// FIXME: This is technically correct but gives a different result to gas,
// but gas is incomplete there (it has a fixme noting it doesn't work with
// 64-bit addresses).
// FIXME: With -msym32 option, the address expansion for N64 should probably
// use the O32 / N32 case. It's safe to use the 64-bit address expansion as the
// symbol's value is considered sign extended.
if(isABI_O32() || isABI_N32()) {
TOut.emitRX(Mips::LUi, ATReg, MCOperand::createExpr(HiExpr), IDLoc, STI);
} else { //isABI_N64()
const MCExpr *HighestSym =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext());
const MipsMCExpr *HighestExpr =
MipsMCExpr::create(MipsMCExpr::MEK_HIGHEST, HighestSym, getContext());
const MCExpr *HigherSym =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext());
const MipsMCExpr *HigherExpr =
MipsMCExpr::create(MipsMCExpr::MEK_HIGHER, HigherSym, getContext());
TOut.emitRX(Mips::LUi, ATReg, MCOperand::createExpr(HighestExpr), IDLoc,
STI);
TOut.emitRRX(Mips::DADDiu, ATReg, ATReg,
MCOperand::createExpr(HigherExpr), IDLoc, STI);
TOut.emitRRI(Mips::DSLL, ATReg, ATReg, 16, IDLoc, STI);
TOut.emitRRX(Mips::DADDiu, ATReg, ATReg, MCOperand::createExpr(HiExpr),
IDLoc, STI);
TOut.emitRRI(Mips::DSLL, ATReg, ATReg, 16, IDLoc, STI);
}
}
return false;
}
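The non-PIC N64 sequence emitted by `emitPartialAddress` above (`lui`, `daddiu`, `dsll 16`, `daddiu`, `dsll 16`, with the `%lo` part left to the following load) can be checked with plain integer arithmetic. The following is a minimal standalone sketch, not code from the parser: all function names are hypothetical, and the `%highest`/`%higher`/`%hi`/`%lo` formulas are the usual MIPS relocation-operator definitions, assumed here rather than taken from this diff.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these names exist in LLVM.

// Sign-extend the low 16 (or 32) bits of V to 64 bits using wrapping
// unsigned arithmetic, which is well defined in C++.
inline uint64_t signExtend16(uint64_t V) {
  return ((V & 0xffffu) ^ 0x8000u) - 0x8000u;
}
inline uint64_t signExtend32(uint64_t V) {
  return ((V & 0xffffffffu) ^ 0x80000000u) - 0x80000000u;
}

// Assumed definitions of the MIPS relocation operators. The added constants
// compensate for the sign extension of the lower 16-bit pieces.
inline uint64_t relocHighest(uint64_t A) {
  return ((A + 0x800080008000ULL) >> 48) & 0xffff;
}
inline uint64_t relocHigher(uint64_t A) {
  return ((A + 0x80008000ULL) >> 32) & 0xffff;
}
inline uint64_t relocHi(uint64_t A) { return ((A + 0x8000ULL) >> 16) & 0xffff; }
inline uint64_t relocLo(uint64_t A) { return A & 0xffff; }

// Simulate the five-instruction sequence plus the %lo offset that the
// subsequent load adds to the register.
inline uint64_t simulateN64Address(uint64_t A) {
  uint64_t AT = signExtend32(relocHighest(A) << 16); // lui   $at, %highest(sym)
  AT += signExtend16(relocHigher(A));                // daddiu $at, $at, %higher(sym)
  AT <<= 16;                                         // dsll  $at, $at, 16
  AT += signExtend16(relocHi(A));                    // daddiu $at, $at, %hi(sym)
  AT <<= 16;                                         // dsll  $at, $at, 16
  return AT + signExtend16(relocLo(A));              // offset applied by the load
}
```

Under these assumptions the simulated sequence reproduces the original 64-bit address, including for addresses whose 16-bit pieces have the sign bit set. The O32/N32 branch is the degenerate case of a single `lui` with `%hi`, and the PIC branch (a GOT load) is not modelled here.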
bool MipsAsmParser::expandLoadImmReal(MCInst &Inst, bool IsSingle, bool IsGPR,
bool Is64FPU, SMLoc IDLoc,
MCStreamer &Out,
const MCSubtargetInfo *STI) {
MipsTargetStreamer &TOut = getTargetStreamer();
assert(Inst.getNumOperands() == 2 && "Invalid operand count");
assert(Inst.getOperand(0).isReg() && Inst.getOperand(1).isImm() &&
"Invalid instruction operand.");
unsigned FirstReg = Inst.getOperand(0).getReg();
uint64_t ImmOp64 = Inst.getOperand(1).getImm();
uint32_t HiImmOp64 = (ImmOp64 & 0xffffffff00000000) >> 32;
// If ImmOp64 is AsmToken::Integer type (all bits set to zero in the
// exponent field), convert it to double (e.g. 1 to 1.0)
if ((HiImmOp64 & 0x7ff00000) == 0) {
APFloat RealVal(APFloat::IEEEdouble(), ImmOp64);
ImmOp64 = RealVal.bitcastToAPInt().getZExtValue();
}
uint32_t LoImmOp64 = ImmOp64 & 0xffffffff;
HiImmOp64 = (ImmOp64 & 0xffffffff00000000) >> 32;
if (IsSingle) {
// Conversion of a double in a uint64_t to a float in a uint32_t,
// retaining the bit pattern of a float.
uint32_t ImmOp32;
double doubleImm = BitsToDouble(ImmOp64);
float tmp_float = static_cast<float>(doubleImm);
ImmOp32 = FloatToBits(tmp_float);
if (IsGPR) {
if (loadImmediate(ImmOp32, FirstReg, Mips::NoRegister, true, true, IDLoc,
Out, STI))
return true;
return false;
} else {
unsigned ATReg = getATReg(IDLoc);
if (!ATReg)
return true;
if (LoImmOp64 == 0) {
if (loadImmediate(ImmOp32, ATReg, Mips::NoRegister, true, true, IDLoc,
Out, STI))
return true;
TOut.emitRR(Mips::MTC1, FirstReg, ATReg, IDLoc, STI);
return false;
}
MCSection *CS = getStreamer().getCurrentSectionOnly();
// FIXME: Enhance this expansion to use the .lit4 & .lit8 sections
// where appropriate.
MCSection *ReadOnlySection = getContext().getELFSection(
".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC);
MCSymbol *Sym = getContext().createTempSymbol();
const MCExpr *LoSym =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext());
const MipsMCExpr *LoExpr =
MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext());
getStreamer().SwitchSection(ReadOnlySection);
getStreamer().EmitLabel(Sym, IDLoc);
getStreamer().EmitIntValue(ImmOp32, 4);
getStreamer().SwitchSection(CS);
if(emitPartialAddress(TOut, IDLoc, Sym))
return true;
TOut.emitRRX(Mips::LWC1, FirstReg, ATReg,
MCOperand::createExpr(LoExpr), IDLoc, STI);
}
return false;
}
// if(!IsSingle)
unsigned ATReg = getATReg(IDLoc);
if (!ATReg)
return true;
if (IsGPR) {
if (LoImmOp64 == 0) {
if(isABI_N32() || isABI_N64()) {
if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, false, true,
IDLoc, Out, STI))
return true;
return false;
} else {
if (loadImmediate(HiImmOp64, FirstReg, Mips::NoRegister, true, true,
IDLoc, Out, STI))
return true;
if (loadImmediate(0, nextReg(FirstReg), Mips::NoRegister, true, true,
IDLoc, Out, STI))
return true;
return false;
}
}
MCSection *CS = getStreamer().getCurrentSectionOnly();
MCSection *ReadOnlySection = getContext().getELFSection(
".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC);
MCSymbol *Sym = getContext().createTempSymbol();
const MCExpr *LoSym =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext());
const MipsMCExpr *LoExpr =
MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext());
getStreamer().SwitchSection(ReadOnlySection);
getStreamer().EmitLabel(Sym, IDLoc);
getStreamer().EmitIntValue(HiImmOp64, 4);
getStreamer().EmitIntValue(LoImmOp64, 4);
getStreamer().SwitchSection(CS);
if(emitPartialAddress(TOut, IDLoc, Sym))
return true;
if(isABI_N64())
TOut.emitRRX(Mips::DADDiu, ATReg, ATReg,
MCOperand::createExpr(LoExpr), IDLoc, STI);
else
TOut.emitRRX(Mips::ADDiu, ATReg, ATReg,
MCOperand::createExpr(LoExpr), IDLoc, STI);
if(isABI_N32() || isABI_N64())
TOut.emitRRI(Mips::LD, FirstReg, ATReg, 0, IDLoc, STI);
else {
TOut.emitRRI(Mips::LW, FirstReg, ATReg, 0, IDLoc, STI);
TOut.emitRRI(Mips::LW, nextReg(FirstReg), ATReg, 4, IDLoc, STI);
}
return false;
} else { // if(!IsGPR && !IsSingle)
if ((LoImmOp64 == 0) &&
!((HiImmOp64 & 0xffff0000) && (HiImmOp64 & 0x0000ffff))) {
// FIXME: In the case where the constant is zero, we can load the
// register directly from the zero register.
if (loadImmediate(HiImmOp64, ATReg, Mips::NoRegister, true, true, IDLoc,
Out, STI))
return true;
if (isABI_N32() || isABI_N64())
TOut.emitRR(Mips::DMTC1, FirstReg, ATReg, IDLoc, STI);
else if (hasMips32r2()) {
TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI);
TOut.emitRRR(Mips::MTHC1_D32, FirstReg, FirstReg, ATReg, IDLoc, STI);
} else {
TOut.emitRR(Mips::MTC1, nextReg(FirstReg), ATReg, IDLoc, STI);
TOut.emitRR(Mips::MTC1, FirstReg, Mips::ZERO, IDLoc, STI);
}
return false;
}
MCSection *CS = getStreamer().getCurrentSectionOnly();
// FIXME: Enhance this expansion to use the .lit4 & .lit8 sections
// where appropriate.
MCSection *ReadOnlySection = getContext().getELFSection(
".rodata", ELF::SHT_PROGBITS, ELF::SHF_ALLOC);
MCSymbol *Sym = getContext().createTempSymbol();
const MCExpr *LoSym =
MCSymbolRefExpr::create(Sym, MCSymbolRefExpr::VK_None, getContext());
const MipsMCExpr *LoExpr =
MipsMCExpr::create(MipsMCExpr::MEK_LO, LoSym, getContext());
getStreamer().SwitchSection(ReadOnlySection);
getStreamer().EmitLabel(Sym, IDLoc);
getStreamer().EmitIntValue(HiImmOp64, 4);
getStreamer().EmitIntValue(LoImmOp64, 4);
getStreamer().SwitchSection(CS);
if(emitPartialAddress(TOut, IDLoc, Sym))
return true;
TOut.emitRRX(Is64FPU ? Mips::LDC164 : Mips::LDC1, FirstReg, ATReg,
MCOperand::createExpr(LoExpr), IDLoc, STI);
}
return false;
}
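The immediate handling in `expandLoadImmReal` above can be illustrated outside of LLVM. The following is a minimal standalone C++ sketch, not the parser's code: the helper names (`doubleToBits`, `normalizeImm`, `splitImm`, `toSingleBits`, `fitsRegisterPath`) are hypothetical, plain `memcpy` reinterpretation stands in for `APFloat`, `BitsToDouble` and `FloatToBits`, and IEEE-754 `float` and `double` are assumed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these names exist in LLVM.

// Reinterpret between floating-point values and their IEEE-754 bit patterns.
inline uint64_t doubleToBits(double D) {
  uint64_t Bits;
  std::memcpy(&Bits, &D, sizeof(Bits));
  return Bits;
}
inline double bitsToDouble(uint64_t Bits) {
  double D;
  std::memcpy(&D, &Bits, sizeof(D));
  return D;
}
inline uint32_t floatToBits(float F) {
  uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// If the exponent field is all zeros, treat the operand as an integer token
// and replace it with the bits of the double of the same value (1 -> 1.0).
inline uint64_t normalizeImm(uint64_t Imm) {
  uint32_t Hi = static_cast<uint32_t>(Imm >> 32);
  if ((Hi & 0x7ff00000u) == 0)
    return doubleToBits(static_cast<double>(Imm));
  return Imm;
}

struct ImmHalves {
  uint32_t Hi;
  uint32_t Lo;
};

// Split the 64-bit pattern into the two 32-bit words used by the expansion.
inline ImmHalves splitImm(uint64_t Imm) {
  return {static_cast<uint32_t>(Imm >> 32),
          static_cast<uint32_t>(Imm & 0xffffffffu)};
}

// li.s: narrow the double to a float and keep the float's bit pattern.
inline uint32_t toSingleBits(uint64_t Imm) {
  return floatToBits(static_cast<float>(bitsToDouble(Imm)));
}

// Condition from the li.d FPR path: the low word is zero and at most one
// 16-bit half of the high word is non-zero, so no .rodata constant is needed.
inline bool fitsRegisterPath(uint64_t Imm) {
  ImmHalves H = splitImm(Imm);
  return H.Lo == 0 && !((H.Hi & 0xffff0000u) && (H.Hi & 0x0000ffffu));
}
```

In this sketch an operand of `1` normalizes to the pattern of `1.0`, whose low word is zero, so it would take the register-only path. A value such as `0.1` has a non-zero low word and would instead correspond to the path that places the constant in `.rodata`.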
bool MipsAsmParser::expandUncondBranchMMPseudo(MCInst &Inst, SMLoc IDLoc,
MCStreamer &Out,
const MCSubtargetInfo *STI) {
@ -4318,45 +4666,6 @@ bool MipsAsmParser::expandDMULMacro(MCInst &Inst, SMLoc IDLoc, MCStreamer &Out,
return false;
}
static unsigned nextReg(unsigned Reg) {
switch (Reg) {
case Mips::ZERO: return Mips::AT;
case Mips::AT: return Mips::V0;
case Mips::V0: return Mips::V1;
case Mips::V1: return Mips::A0;
case Mips::A0: return Mips::A1;
case Mips::A1: return Mips::A2;
case Mips::A2: return Mips::A3;
case Mips::A3: return Mips::T0;
case Mips::T0: return Mips::T1;
case Mips::T1: return Mips::T2;
case Mips::T2: return Mips::T3;
case Mips::T3: return Mips::T4;
case Mips::T4: return Mips::T5;
case Mips::T5: return Mips::T6;
case Mips::T6: return Mips::T7;
case Mips::T7: return Mips::S0;
case Mips::S0: return Mips::S1;
case Mips::S1: return Mips::S2;
case Mips::S2: return Mips::S3;
case Mips::S3: return Mips::S4;
case Mips::S4: return Mips::S5;
case Mips::S5: return Mips::S6;
case Mips::S6: return Mips::S7;
case Mips::S7: return Mips::T8;
case Mips::T8: return Mips::T9;
case Mips::T9: return Mips::K0;
case Mips::K0: return Mips::K1;
case Mips::K1: return Mips::GP;
case Mips::GP: return Mips::SP;
case Mips::SP: return Mips::FP;
case Mips::FP: return Mips::RA;
case Mips::RA: return Mips::ZERO;
default: return 0;
}
}
// Expand 'ld $<reg> offset($reg2)' to 'lw $<reg>, offset($reg2);
// lw $<reg+1>, offset+4($reg2)'
// or expand 'sd $<reg> offset($reg2)' to 'sw $<reg>, offset($reg2);

@ -362,7 +362,6 @@ MipsTargetLowering::MipsTargetLowering(const MipsTargetMachine &TM,
setOperationAction(ISD::FCOS, MVT::f64, Expand);
setOperationAction(ISD::FSINCOS, MVT::f32, Expand);
setOperationAction(ISD::FSINCOS, MVT::f64, Expand);
setOperationAction(ISD::FPOWI, MVT::f32, Expand);
setOperationAction(ISD::FPOW, MVT::f32, Expand);
setOperationAction(ISD::FPOW, MVT::f64, Expand);
setOperationAction(ISD::FLOG, MVT::f32, Expand);

@ -681,6 +681,29 @@ def PseudoTRUNC_W_D : MipsAsmPseudoInst<(outs FGR32Opnd:$fd),
"trunc.w.d\t$fd, $fs, $rs">,
FGR_64, HARDFLOAT;
def LoadImmSingleGPR : MipsAsmPseudoInst<(outs GPR32Opnd:$rd),
(ins imm64:$fpimm),
"li.s\t$rd, $fpimm">;
def LoadImmSingleFGR : MipsAsmPseudoInst<(outs StrictlyFGR32Opnd:$rd),
(ins imm64:$fpimm),
"li.s\t$rd, $fpimm">,
HARDFLOAT;
def LoadImmDoubleGPR : MipsAsmPseudoInst<(outs GPR32Opnd:$rd),
(ins imm64:$fpimm),
"li.d\t$rd, $fpimm">;
def LoadImmDoubleFGR_32 : MipsAsmPseudoInst<(outs StrictlyAFGR64Opnd:$rd),
(ins imm64:$fpimm),
"li.d\t$rd, $fpimm">,
FGR_32, HARDFLOAT;
def LoadImmDoubleFGR : MipsAsmPseudoInst<(outs StrictlyFGR64Opnd:$rd),
(ins imm64:$fpimm),
"li.d\t$rd, $fpimm">,
FGR_64, HARDFLOAT;
//===----------------------------------------------------------------------===//
// InstAliases.
//===----------------------------------------------------------------------===//

@ -552,16 +552,31 @@ def AFGR64AsmOperand : MipsAsmRegOperand {
let PredicateMethod = "isFGRAsmReg";
}
def StrictlyAFGR64AsmOperand : MipsAsmRegOperand {
let Name = "StrictlyAFGR64AsmReg";
let PredicateMethod = "isStrictlyFGRAsmReg";
}
def FGR64AsmOperand : MipsAsmRegOperand {
let Name = "FGR64AsmReg";
let PredicateMethod = "isFGRAsmReg";
}
def StrictlyFGR64AsmOperand : MipsAsmRegOperand {
let Name = "StrictlyFGR64AsmReg";
let PredicateMethod = "isStrictlyFGRAsmReg";
}
def FGR32AsmOperand : MipsAsmRegOperand {
let Name = "FGR32AsmReg";
let PredicateMethod = "isFGRAsmReg";
}
def StrictlyFGR32AsmOperand : MipsAsmRegOperand {
let Name = "StrictlyFGR32AsmReg";
let PredicateMethod = "isStrictlyFGRAsmReg";
}
def FGRH32AsmOperand : MipsAsmRegOperand {
let Name = "FGRH32AsmReg";
let PredicateMethod = "isFGRAsmReg";
@ -639,14 +654,26 @@ def AFGR64Opnd : RegisterOperand<AFGR64> {
let ParserMatchClass = AFGR64AsmOperand;
}
def StrictlyAFGR64Opnd : RegisterOperand<AFGR64> {
let ParserMatchClass = StrictlyAFGR64AsmOperand;
}
def FGR64Opnd : RegisterOperand<FGR64> {
let ParserMatchClass = FGR64AsmOperand;
}
def StrictlyFGR64Opnd : RegisterOperand<FGR64> {
let ParserMatchClass = StrictlyFGR64AsmOperand;
}
def FGR32Opnd : RegisterOperand<FGR32> {
let ParserMatchClass = FGR32AsmOperand;
}
def StrictlyFGR32Opnd : RegisterOperand<FGR32> {
let ParserMatchClass = StrictlyFGR32AsmOperand;
}
def FGRCCOpnd : RegisterOperand<FGRCC> {
// The assembler doesn't use register classes so we can re-use
// FGR32AsmOperand.

@ -539,7 +539,6 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setOperationAction(ISD::FSIN, VT, Expand);
setOperationAction(ISD::FCOS, VT, Expand);
setOperationAction(ISD::FABS, VT, Expand);
setOperationAction(ISD::FPOWI, VT, Expand);
setOperationAction(ISD::FFLOOR, VT, Expand);
setOperationAction(ISD::FCEIL, VT, Expand);
setOperationAction(ISD::FTRUNC, VT, Expand);
@ -798,7 +797,6 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setOperationAction(ISD::FABS , MVT::v4f64, Legal);
setOperationAction(ISD::FSIN , MVT::v4f64, Expand);
setOperationAction(ISD::FCOS , MVT::v4f64, Expand);
setOperationAction(ISD::FPOWI , MVT::v4f64, Expand);
setOperationAction(ISD::FPOW , MVT::v4f64, Expand);
setOperationAction(ISD::FLOG , MVT::v4f64, Expand);
setOperationAction(ISD::FLOG2 , MVT::v4f64, Expand);
@ -844,7 +842,6 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
setOperationAction(ISD::FABS , MVT::v4f32, Legal);
setOperationAction(ISD::FSIN , MVT::v4f32, Expand);
setOperationAction(ISD::FCOS , MVT::v4f32, Expand);
setOperationAction(ISD::FPOWI , MVT::v4f32, Expand);
setOperationAction(ISD::FPOW , MVT::v4f32, Expand);
setOperationAction(ISD::FLOG , MVT::v4f32, Expand);
setOperationAction(ISD::FLOG2 , MVT::v4f32, Expand);

@ -54,6 +54,8 @@ include "SystemZInstrFormats.td"
include "SystemZInstrInfo.td"
include "SystemZInstrVector.td"
include "SystemZInstrFP.td"
include "SystemZInstrHFP.td"
include "SystemZInstrDFP.td"
def SystemZInstrInfo : InstrInfo {}

@ -115,12 +115,18 @@ def FeatureTransactionalExecution : SystemZFeature<
"Assume that the transactional-execution facility is installed"
>;
def FeatureDFPZonedConversion : SystemZFeature<
"dfp-zoned-conversion", "DFPZonedConversion",
"Assume that the DFP zoned-conversion facility is installed"
>;
def Arch10NewFeatures : SystemZFeatureList<[
FeatureExecutionHint,
FeatureLoadAndTrap,
FeatureMiscellaneousExtensions,
FeatureProcessorAssist,
FeatureTransactionalExecution
FeatureTransactionalExecution,
FeatureDFPZonedConversion
]>;
//===----------------------------------------------------------------------===//
@ -144,6 +150,11 @@ def FeatureMessageSecurityAssist5 : SystemZFeature<
"Assume that the message-security-assist extension facility 5 is installed"
>;
def FeatureDFPPackedConversion : SystemZFeature<
"dfp-packed-conversion", "DFPPackedConversion",
"Assume that the DFP packed-conversion facility is installed"
>;
def FeatureVector : SystemZFeature<
"vector", "Vector",
"Assume that the vector facility is installed"
@ -154,6 +165,7 @@ def Arch11NewFeatures : SystemZFeatureList<[
FeatureLoadAndZeroRightmostByte,
FeatureLoadStoreOnCond2,
FeatureMessageSecurityAssist5,
FeatureDFPPackedConversion,
FeatureVector
]>;

@ -4189,12 +4189,20 @@ static SDValue buildVector(SelectionDAG &DAG, const SDLoc &DL, EVT VT,
if (Single.getNode() && (Count > 1 || Single.getOpcode() == ISD::LOAD))
return DAG.getNode(SystemZISD::REPLICATE, DL, VT, Single);
// If all elements are loads, use VLREP/VLEs (below).
bool AllLoads = true;
for (auto Elem : Elems)
if (Elem.getOpcode() != ISD::LOAD || cast<LoadSDNode>(Elem)->isIndexed()) {
AllLoads = false;
break;
}
// The best way of building a v2i64 from two i64s is to use VLVGP.
if (VT == MVT::v2i64)
if (VT == MVT::v2i64 && !AllLoads)
return joinDwords(DAG, DL, Elems[0], Elems[1]);
// Use a 64-bit merge high to combine two doubles.
if (VT == MVT::v2f64)
if (VT == MVT::v2f64 && !AllLoads)
return buildMergeScalars(DAG, DL, VT, Elems[0], Elems[1]);
// Build v4f32 values directly from the FPRs:
@ -4204,7 +4212,7 @@ static SDValue buildVector(SelectionDAG &DAG, const SDLoc &DL, EVT VT,
// <ABxx> <CDxx>
// V VMRHG
// <ABCD>
if (VT == MVT::v4f32) {
if (VT == MVT::v4f32 && !AllLoads) {
SDValue Op01 = buildMergeScalars(DAG, DL, VT, Elems[0], Elems[1]);
SDValue Op23 = buildMergeScalars(DAG, DL, VT, Elems[2], Elems[3]);
// Avoid unnecessary undefs by reusing the other operand.
@ -4246,23 +4254,37 @@ static SDValue buildVector(SelectionDAG &DAG, const SDLoc &DL, EVT VT,
Constants[I] = DAG.getUNDEF(Elems[I].getValueType());
Result = DAG.getBuildVector(VT, DL, Constants);
} else {
// Otherwise try to use VLVGP to start the sequence in order to
// Otherwise try to use VLREP or VLVGP to start the sequence in order to
// avoid a false dependency on any previous contents of the vector
// register. This only makes sense if one of the associated elements
// is defined.
unsigned I1 = NumElements / 2 - 1;
unsigned I2 = NumElements - 1;
bool Def1 = !Elems[I1].isUndef();
bool Def2 = !Elems[I2].isUndef();
if (Def1 || Def2) {
SDValue Elem1 = Elems[Def1 ? I1 : I2];
SDValue Elem2 = Elems[Def2 ? I2 : I1];
Result = DAG.getNode(ISD::BITCAST, DL, VT,
joinDwords(DAG, DL, Elem1, Elem2));
Done[I1] = true;
Done[I2] = true;
} else
Result = DAG.getUNDEF(VT);
// register.
// Use a VLREP if at least one element is a load.
unsigned LoadElIdx = UINT_MAX;
for (unsigned I = 0; I < NumElements; ++I)
if (Elems[I].getOpcode() == ISD::LOAD &&
cast<LoadSDNode>(Elems[I])->isUnindexed()) {
LoadElIdx = I;
break;
}
if (LoadElIdx != UINT_MAX) {
Result = DAG.getNode(SystemZISD::REPLICATE, DL, VT, Elems[LoadElIdx]);
Done[LoadElIdx] = true;
} else {
// Try to use VLVGP.
unsigned I1 = NumElements / 2 - 1;
unsigned I2 = NumElements - 1;
bool Def1 = !Elems[I1].isUndef();
bool Def2 = !Elems[I2].isUndef();
if (Def1 || Def2) {
SDValue Elem1 = Elems[Def1 ? I1 : I2];
SDValue Elem2 = Elems[Def2 ? I2 : I1];
Result = DAG.getNode(ISD::BITCAST, DL, VT,
joinDwords(DAG, DL, Elem1, Elem2));
Done[I1] = true;
Done[I2] = true;
} else
Result = DAG.getUNDEF(VT);
}
}
// Use VLVGx to insert the other elements.
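The load checks added to `buildVector` above boil down to two scans over the element list: one asking whether every element is a plain load, and one looking for the first plain load to seed a replicate. A minimal standalone sketch follows. The `ElemKind` enum and both function names are hypothetical stand-ins for the `SDValue` opcode and `LoadSDNode` queries, chosen only so the control flow can run outside of LLVM.

```cpp
#include <climits>
#include <vector>

// Hypothetical classification of a build-vector element; in the real code this
// information comes from the element's opcode and the load's indexing mode.
enum class ElemKind { Undef, Other, Load, IndexedLoad };

// True when every element is an unindexed load. An empty list yields true,
// matching a flag that starts out set and is cleared on the first mismatch.
inline bool allUnindexedLoads(const std::vector<ElemKind> &Elems) {
  for (ElemKind K : Elems)
    if (K != ElemKind::Load)
      return false;
  return true;
}

// Index of the first unindexed load, or UINT_MAX when there is none.
inline unsigned firstUnindexedLoad(const std::vector<ElemKind> &Elems) {
  for (unsigned I = 0; I < Elems.size(); ++I)
    if (Elems[I] == ElemKind::Load)
      return I;
  return UINT_MAX;
}
```

In terms of the diff, a true result from the first scan skips the merge-based shortcuts for v2i64, v2f64 and v4f32, and a result other than `UINT_MAX` from the second selects the replicate as the start of the insertion sequence instead of VLVGP.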

@ -0,0 +1,231 @@
//==- SystemZInstrDFP.td - Floating-point SystemZ instructions -*- tblgen-*-==//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// The instructions in this file implement SystemZ decimal floating-point
// arithmetic. These instructions are not currently used for code generation,
// but are provided for use with the assembler and disassembler only. If LLVM
// ever supports decimal floating-point types (_Decimal64 etc.), they can
// also be used for code generation for those types.
//
//===----------------------------------------------------------------------===//
//===----------------------------------------------------------------------===//
// Move instructions
//===----------------------------------------------------------------------===//
// Load and test.
let Defs = [CC] in {
def LTDTR : UnaryRRE<"ltdtr", 0xB3D6, null_frag, FP64, FP64>;
def LTXTR : UnaryRRE<"ltxtr", 0xB3DE, null_frag, FP128, FP128>;
}
//===----------------------------------------------------------------------===//
// Conversion instructions
//===----------------------------------------------------------------------===//
// Convert floating-point values to narrower representations. The destination
// of LDXTR is a 128-bit value, but only the first register of the pair is used.
def LEDTR : TernaryRRFe<"ledtr", 0xB3D5, FP32, FP64>;
def LDXTR : TernaryRRFe<"ldxtr", 0xB3DD, FP128, FP128>;
// Extend floating-point values to wider representations.
def LDETR : BinaryRRFd<"ldetr", 0xB3D4, FP64, FP32>;
def LXDTR : BinaryRRFd<"lxdtr", 0xB3DC, FP128, FP64>;
// Convert a signed integer value to a floating-point one.
def CDGTR : UnaryRRE<"cdgtr", 0xB3F1, null_frag, FP64, GR64>;
def CXGTR : UnaryRRE<"cxgtr", 0xB3F9, null_frag, FP128, GR64>;
let Predicates = [FeatureFPExtension] in {
def CDGTRA : TernaryRRFe<"cdgtra", 0xB3F1, FP64, GR64>;
def CXGTRA : TernaryRRFe<"cxgtra", 0xB3F9, FP128, GR64>;
def CDFTR : TernaryRRFe<"cdftr", 0xB951, FP64, GR32>;
def CXFTR : TernaryRRFe<"cxftr", 0xB959, FP128, GR32>;
}
// Convert an unsigned integer value to a floating-point one.
let Predicates = [FeatureFPExtension] in {
def CDLGTR : TernaryRRFe<"cdlgtr", 0xB952, FP64, GR64>;
def CXLGTR : TernaryRRFe<"cxlgtr", 0xB95A, FP128, GR64>;
def CDLFTR : TernaryRRFe<"cdlftr", 0xB953, FP64, GR32>;
def CXLFTR : TernaryRRFe<"cxlftr", 0xB95B, FP128, GR32>;
}
// Convert a floating-point value to a signed integer value.
let Defs = [CC] in {
def CGDTR : BinaryRRFe<"cgdtr", 0xB3E1, GR64, FP64>;
def CGXTR : BinaryRRFe<"cgxtr", 0xB3E9, GR64, FP128>;
let Predicates = [FeatureFPExtension] in {
def CGDTRA : TernaryRRFe<"cgdtra", 0xB3E1, GR64, FP64>;
def CGXTRA : TernaryRRFe<"cgxtra", 0xB3E9, GR64, FP128>;
def CFDTR : TernaryRRFe<"cfdtr", 0xB941, GR32, FP64>;
def CFXTR : TernaryRRFe<"cfxtr", 0xB949, GR32, FP128>;
}
}
// Convert a floating-point value to an unsigned integer value.
let Defs = [CC] in {
let Predicates = [FeatureFPExtension] in {
def CLGDTR : TernaryRRFe<"clgdtr", 0xB942, GR64, FP64>;
def CLGXTR : TernaryRRFe<"clgxtr", 0xB94A, GR64, FP128>;
def CLFDTR : TernaryRRFe<"clfdtr", 0xB943, GR32, FP64>;
def CLFXTR : TernaryRRFe<"clfxtr", 0xB94B, GR32, FP128>;
}
}
// Convert a packed value to a floating-point one.
def CDSTR : UnaryRRE<"cdstr", 0xB3F3, null_frag, FP64, GR64>;
def CXSTR : UnaryRRE<"cxstr", 0xB3FB, null_frag, FP128, GR128>;
def CDUTR : UnaryRRE<"cdutr", 0xB3F2, null_frag, FP64, GR64>;
def CXUTR : UnaryRRE<"cxutr", 0xB3FA, null_frag, FP128, GR128>;
// Convert a floating-point value to a packed value.
def CSDTR : BinaryRRFd<"csdtr", 0xB3E3, GR64, FP64>;
def CSXTR : BinaryRRFd<"csxtr", 0xB3EB, GR128, FP128>;
def CUDTR : UnaryRRE<"cudtr", 0xB3E2, null_frag, GR64, FP64>;
def CUXTR : UnaryRRE<"cuxtr", 0xB3EA, null_frag, GR128, FP128>;
// Convert from/to memory values in the zoned format.
let Predicates = [FeatureDFPZonedConversion] in {
def CDZT : BinaryRSL<"cdzt", 0xEDAA, FP64>;
def CXZT : BinaryRSL<"cxzt", 0xEDAB, FP128>;
def CZDT : StoreBinaryRSL<"czdt", 0xEDA8, FP64>;
def CZXT : StoreBinaryRSL<"czxt", 0xEDA9, FP128>;
}
// Convert from/to memory values in the packed format.
let Predicates = [FeatureDFPPackedConversion] in {
def CDPT : BinaryRSL<"cdpt", 0xEDAE, FP64>;
def CXPT : BinaryRSL<"cxpt", 0xEDAF, FP128>;
def CPDT : StoreBinaryRSL<"cpdt", 0xEDAC, FP64>;
def CPXT : StoreBinaryRSL<"cpxt", 0xEDAD, FP128>;
}
// Perform floating-point operation.
let Defs = [CC, R1L, F0Q], Uses = [R0L, F4Q] in
def PFPO : SideEffectInherentE<"pfpo", 0x010A>;
//===----------------------------------------------------------------------===//
// Unary arithmetic
//===----------------------------------------------------------------------===//
// Round to an integer, with the second operand (M3) specifying the rounding
// mode. M4 can be set to 4 to suppress detection of inexact conditions.
def FIDTR : TernaryRRFe<"fidtr", 0xB3D7, FP64, FP64>;
def FIXTR : TernaryRRFe<"fixtr", 0xB3DF, FP128, FP128>;
// Extract biased exponent.
def EEDTR : UnaryRRE<"eedtr", 0xB3E5, null_frag, FP64, FP64>;
def EEXTR : UnaryRRE<"eextr", 0xB3ED, null_frag, FP128, FP128>;
// Extract significance.
def ESDTR : UnaryRRE<"esdtr", 0xB3E7, null_frag, FP64, FP64>;
def ESXTR : UnaryRRE<"esxtr", 0xB3EF, null_frag, FP128, FP128>;
//===----------------------------------------------------------------------===//
// Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition.
let Defs = [CC] in {
let isCommutable = 1 in {
def ADTR : BinaryRRFa<"adtr", 0xB3D2, null_frag, FP64, FP64, FP64>;
def AXTR : BinaryRRFa<"axtr", 0xB3DA, null_frag, FP128, FP128, FP128>;
}
let Predicates = [FeatureFPExtension] in {
def ADTRA : TernaryRRFa<"adtra", 0xB3D2, FP64, FP64, FP64>;
def AXTRA : TernaryRRFa<"axtra", 0xB3DA, FP128, FP128, FP128>;
}
}
// Subtraction.
let Defs = [CC] in {
def SDTR : BinaryRRFa<"sdtr", 0xB3D3, null_frag, FP64, FP64, FP64>;
def SXTR : BinaryRRFa<"sxtr", 0xB3DB, null_frag, FP128, FP128, FP128>;
let Predicates = [FeatureFPExtension] in {
def SDTRA : TernaryRRFa<"sdtra", 0xB3D3, FP64, FP64, FP64>;
def SXTRA : TernaryRRFa<"sxtra", 0xB3DB, FP128, FP128, FP128>;
}
}
// Multiplication.
let isCommutable = 1 in {
def MDTR : BinaryRRFa<"mdtr", 0xB3D0, null_frag, FP64, FP64, FP64>;
def MXTR : BinaryRRFa<"mxtr", 0xB3D8, null_frag, FP128, FP128, FP128>;
}
let Predicates = [FeatureFPExtension] in {
def MDTRA : TernaryRRFa<"mdtra", 0xB3D0, FP64, FP64, FP64>;
def MXTRA : TernaryRRFa<"mxtra", 0xB3D8, FP128, FP128, FP128>;
}
// Division.
def DDTR : BinaryRRFa<"ddtr", 0xB3D1, null_frag, FP64, FP64, FP64>;
def DXTR : BinaryRRFa<"dxtr", 0xB3D9, null_frag, FP128, FP128, FP128>;
let Predicates = [FeatureFPExtension] in {
def DDTRA : TernaryRRFa<"ddtra", 0xB3D1, FP64, FP64, FP64>;
def DXTRA : TernaryRRFa<"dxtra", 0xB3D9, FP128, FP128, FP128>;
}
// Quantize.
def QADTR : TernaryRRFb<"qadtr", 0xB3F5, FP64, FP64, FP64>;
def QAXTR : TernaryRRFb<"qaxtr", 0xB3FD, FP128, FP128, FP128>;
// Reround.
def RRDTR : TernaryRRFb<"rrdtr", 0xB3F7, FP64, FP64, FP64>;
def RRXTR : TernaryRRFb<"rrxtr", 0xB3FF, FP128, FP128, FP128>;
// Shift significand left/right.
def SLDT : BinaryRXF<"sldt", 0xED40, null_frag, FP64, FP64, null_frag, 0>;
def SLXT : BinaryRXF<"slxt", 0xED48, null_frag, FP128, FP128, null_frag, 0>;
def SRDT : BinaryRXF<"srdt", 0xED41, null_frag, FP64, FP64, null_frag, 0>;
def SRXT : BinaryRXF<"srxt", 0xED49, null_frag, FP128, FP128, null_frag, 0>;
// Insert biased exponent.
def IEDTR : BinaryRRFb<"iedtr", 0xB3F6, null_frag, FP64, FP64, FP64>;
def IEXTR : BinaryRRFb<"iextr", 0xB3FE, null_frag, FP128, FP128, FP128>;
//===----------------------------------------------------------------------===//
// Comparisons
//===----------------------------------------------------------------------===//
// Compare.
let Defs = [CC] in {
def CDTR : CompareRRE<"cdtr", 0xB3E4, null_frag, FP64, FP64>;
def CXTR : CompareRRE<"cxtr", 0xB3EC, null_frag, FP128, FP128>;
}
// Compare and signal.
let Defs = [CC] in {
def KDTR : CompareRRE<"kdtr", 0xB3E0, null_frag, FP64, FP64>;
def KXTR : CompareRRE<"kxtr", 0xB3E8, null_frag, FP128, FP128>;
}
// Compare biased exponent.
let Defs = [CC] in {
def CEDTR : CompareRRE<"cedtr", 0xB3F4, null_frag, FP64, FP64>;
def CEXTR : CompareRRE<"cextr", 0xB3FC, null_frag, FP128, FP128>;
}
// Test Data Class.
let Defs = [CC] in {
def TDCET : TestRXE<"tdcet", 0xED50, null_frag, FP32>;
def TDCDT : TestRXE<"tdcdt", 0xED54, null_frag, FP64>;
def TDCXT : TestRXE<"tdcxt", 0xED58, null_frag, FP128>;
}
// Test Data Group.
let Defs = [CC] in {
def TDGET : TestRXE<"tdget", 0xED51, null_frag, FP32>;
def TDGDT : TestRXE<"tdgdt", 0xED55, null_frag, FP64>;
def TDGXT : TestRXE<"tdgxt", 0xED59, null_frag, FP128>;
}

@ -121,7 +121,8 @@ let canFoldAsLoad = 1, SimpleBDXLoad = 1 in {
defm LD : UnaryRXPair<"ld", 0x68, 0xED65, load, FP64, 8>;
// For z13 we prefer LDE over LE to avoid partial register dependencies.
-def LDE32 : UnaryRXE<"lde", 0xED24, null_frag, FP32, 4>;
+let isCodeGenOnly = 1 in
+def LDE32 : UnaryRXE<"lde", 0xED24, null_frag, FP32, 4>;
// These instructions are split after register allocation, so we don't
// want a custom inserter.
@ -437,18 +438,18 @@ def : Pat<(fmul (f128 (fpextend FP64:$src1)),
bdxaddr12only:$addr)>;
// Fused multiply-add.
-def MAEBR : TernaryRRD<"maebr", 0xB30E, z_fma, FP32>;
-def MADBR : TernaryRRD<"madbr", 0xB31E, z_fma, FP64>;
+def MAEBR : TernaryRRD<"maebr", 0xB30E, z_fma, FP32, FP32>;
+def MADBR : TernaryRRD<"madbr", 0xB31E, z_fma, FP64, FP64>;
-def MAEB : TernaryRXF<"maeb", 0xED0E, z_fma, FP32, load, 4>;
-def MADB : TernaryRXF<"madb", 0xED1E, z_fma, FP64, load, 8>;
+def MAEB : TernaryRXF<"maeb", 0xED0E, z_fma, FP32, FP32, load, 4>;
+def MADB : TernaryRXF<"madb", 0xED1E, z_fma, FP64, FP64, load, 8>;
// Fused multiply-subtract.
-def MSEBR : TernaryRRD<"msebr", 0xB30F, z_fms, FP32>;
-def MSDBR : TernaryRRD<"msdbr", 0xB31F, z_fms, FP64>;
+def MSEBR : TernaryRRD<"msebr", 0xB30F, z_fms, FP32, FP32>;
+def MSDBR : TernaryRRD<"msdbr", 0xB31F, z_fms, FP64, FP64>;
-def MSEB : TernaryRXF<"mseb", 0xED0F, z_fms, FP32, load, 4>;
-def MSDB : TernaryRXF<"msdb", 0xED1F, z_fms, FP64, load, 8>;
+def MSEB : TernaryRXF<"mseb", 0xED0F, z_fms, FP32, FP32, load, 4>;
+def MSDB : TernaryRXF<"msdb", 0xED1F, z_fms, FP64, FP64, load, 8>;
// Division.
def DEBR : BinaryRRE<"debr", 0xB30D, fdiv, FP32, FP32>;

@ -527,6 +527,22 @@ class InstRRFc<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
let Inst{3-0} = R2;
}
class InstRRFd<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
: InstSystemZ<4, outs, ins, asmstr, pattern> {
field bits<32> Inst;
field bits<32> SoftFail = 0;
bits<4> R1;
bits<4> R2;
bits<4> M4;
let Inst{31-16} = op;
let Inst{15-12} = 0;
let Inst{11-8} = M4;
let Inst{7-4} = R1;
let Inst{3-0} = R2;
}
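The `InstRRFd` field assignments above amount to a 32-bit packing rule. The following Python sketch is not part of the commit; it only restates that layout so it can be checked by hand. The function name `encode_rrfd` and the sample operand values are hypothetical, and the opcode is just an illustrative 16-bit value.

```python
def encode_rrfd(op, r1, r2, m4):
    """Pack the InstRRFd fields into one 32-bit instruction word.

    Layout taken from the class definition:
      Inst{31-16} = op, Inst{15-12} = 0, Inst{11-8} = M4,
      Inst{7-4} = R1, Inst{3-0} = R2
    """
    for name, value, width in (("op", op, 16), ("R1", r1, 4),
                               ("R2", r2, 4), ("M4", m4, 4)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    # Bits 15-12 stay zero, so nothing is OR-ed into that nibble.
    return (op << 16) | (m4 << 8) | (r1 << 4) | r2


# Illustrative values only: opcode 0xB3D4, R1 = 2, R2 = 4, M4 = 1.
word = encode_rrfd(0xB3D4, r1=2, r2=4, m4=1)
print(f"{word:08X}")  # prints B3D40124
```

Reading the result nibble by nibble gives the opcode `B3D4`, the zero nibble, then `M4`, `R1` and `R2` in that order. Note that this encoding order differs from the assembly operand order `$R1, $R2, $M4` used by `BinaryRRFd`.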
class InstRRFe<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
: InstSystemZ<4, outs, ins, asmstr, pattern> {
field bits<32> Inst;
@ -725,6 +741,22 @@ class InstRSLa<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
let Inst{7-0} = op{7-0};
}
class InstRSLb<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
: InstSystemZ<6, outs, ins, asmstr, pattern> {
field bits<48> Inst;
field bits<48> SoftFail = 0;
bits<4> R1;
bits<24> BDL2;
bits<4> M3;
let Inst{47-40} = op{15-8};
let Inst{39-16} = BDL2;
let Inst{15-12} = R1;
let Inst{11-8} = M3;
let Inst{7-0} = op{7-0};
}
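`InstRSLb` is a 6-byte format whose 16-bit opcode is split across the first and last bytes. The Python sketch below is not part of the commit and restates only the bit layout given in the class. The name `encode_rslb` and the sample values are hypothetical, `BDL2` is treated as an opaque 24-bit field because its internal structure is not shown here, and big-endian byte order is assumed, as used by SystemZ.

```python
def encode_rslb(op, r1, bdl2, m3):
    """Pack the InstRSLb fields into a 48-bit instruction (6 bytes).

    Layout taken from the class definition:
      Inst{47-40} = op{15-8}, Inst{39-16} = BDL2, Inst{15-12} = R1,
      Inst{11-8} = M3, Inst{7-0} = op{7-0}
    """
    for name, value, width in (("op", op, 16), ("R1", r1, 4),
                               ("BDL2", bdl2, 24), ("M3", m3, 4)):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    word = (((op >> 8) << 40)   # high opcode byte
            | (bdl2 << 16)      # opaque 24-bit address/length field
            | (r1 << 12)
            | (m3 << 8)
            | (op & 0xFF))      # low opcode byte
    # Assumption: most significant byte first.
    return word.to_bytes(6, "big")


# Illustrative values only.
print(encode_rslb(0xEDAA, r1=1, bdl2=0x0F1234, m3=0).hex())  # prints ed0f123410aa
```

Both `BinaryRSL` (load form) and `StoreBinaryRSL` (store form) are built on this layout, so they differ in operand direction rather than in bit placement.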
class InstRSYa<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
: InstSystemZ<6, outs, ins, asmstr, pattern> {
field bits<48> Inst;
@ -2752,6 +2784,15 @@ class BinaryRRE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
let DisableEncoding = "$R1src";
}
class BinaryRRD<string mnemonic, bits<16> opcode, SDPatternOperator operator,
RegisterOperand cls1, RegisterOperand cls2>
: InstRRD<opcode, (outs cls1:$R1), (ins cls2:$R3, cls2:$R2),
mnemonic#"\t$R1, $R3, $R2",
[(set cls1:$R1, (operator cls2:$R3, cls2:$R2))]> {
let OpKey = mnemonic#cls;
let OpType = "reg";
}
class BinaryRRFa<string mnemonic, bits<16> opcode, SDPatternOperator operator,
RegisterOperand cls1, RegisterOperand cls2,
RegisterOperand cls3>
@ -2808,6 +2849,11 @@ multiclass BinaryMemRRFcOpt<string mnemonic, bits<16> opcode,
def Opt : UnaryMemRRFc<mnemonic, opcode, cls1, cls2>;
}
class BinaryRRFd<string mnemonic, bits<16> opcode, RegisterOperand cls1,
RegisterOperand cls2>
: InstRRFd<opcode, (outs cls1:$R1), (ins cls2:$R2, imm32zx4:$M4),
mnemonic#"\t$R1, $R2, $M4", []>;
class BinaryRRFe<string mnemonic, bits<16> opcode, RegisterOperand cls1,
RegisterOperand cls2>
: InstRRFe<opcode, (outs cls1:$R1), (ins imm32zx4:$M3, cls2:$R2),
@ -2958,6 +3004,13 @@ multiclass BinaryRSAndK<string mnemonic, bits<8> opcode1, bits<16> opcode2,
}
}
class BinaryRSL<string mnemonic, bits<16> opcode, RegisterOperand cls>
: InstRSLb<opcode, (outs cls:$R1),
(ins bdladdr12onlylen8:$BDL2, imm32zx4:$M3),
mnemonic#"\t$R1, $BDL2, $M3", []> {
let mayLoad = 1;
}
class BinaryRX<string mnemonic, bits<8> opcode, SDPatternOperator operator,
RegisterOperand cls, SDPatternOperator load, bits<5> bytes,
AddressingMode mode = bdxaddr12only>
@ -2987,6 +3040,18 @@ class BinaryRXE<string mnemonic, bits<16> opcode, SDPatternOperator operator,
let M3 = 0;
}
class BinaryRXF<string mnemonic, bits<16> opcode, SDPatternOperator operator,
RegisterOperand cls1, RegisterOperand cls2,
SDPatternOperator load, bits<5> bytes>
: InstRXF<opcode, (outs cls1:$R1), (ins cls2:$R3, bdxaddr12only:$XBD2),
mnemonic#"\t$R1, $R3, $XBD2",
[(set cls1:$R1, (operator cls2:$R3, (load bdxaddr12only:$XBD2)))]> {
let OpKey = mnemonic#"r"#cls;
let OpType = "mem";
let mayLoad = 1;
let AccessBytes = bytes;
}
class BinaryRXY<string mnemonic, bits<16> opcode, SDPatternOperator operator,
RegisterOperand cls, SDPatternOperator load, bits<5> bytes,
AddressingMode mode = bdxaddr20only>
@ -3294,6 +3359,13 @@ multiclass StoreBinaryRSPair<string mnemonic, bits<8> rsOpcode,
}
}
class StoreBinaryRSL<string mnemonic, bits<16> opcode, RegisterOperand cls>
: InstRSLb<opcode, (outs),
(ins cls:$R1, bdladdr12onlylen8:$BDL2, imm32zx4:$M3),
mnemonic#"\t$R1, $BDL2, $M3", []> {
let mayStore = 1;
}
class StoreBinaryVRV<string mnemonic, bits<16> opcode, bits<5> bytes,
Immediate index>
: InstVRV<opcode, (outs), (ins VR128:$V1, bdvaddr12only:$VBD2, index:$M3),
@ -3581,6 +3653,12 @@ class SideEffectTernarySSF<string mnemonic, bits<12> opcode,
(ins bdaddr12only:$BD1, bdaddr12only:$BD2, cls:$R3),
mnemonic#"\t$BD1, $BD2, $R3", []>;
class TernaryRRFa<string mnemonic, bits<16> opcode,
RegisterOperand cls1, RegisterOperand cls2,
RegisterOperand cls3>
: InstRRFa<opcode, (outs cls1:$R1), (ins cls2:$R2, cls3:$R3, imm32zx4:$M4),
mnemonic#"\t$R1, $R2, $R3, $M4", []>;
class TernaryRRFb<string mnemonic, bits<16> opcode,
RegisterOperand cls1, RegisterOperand cls2,
RegisterOperand cls3>
@ -3597,11 +3675,11 @@ class TernaryRRFe<string mnemonic, bits<16> opcode, RegisterOperand cls1,
(ins imm32zx4:$M3, cls2:$R2, imm32zx4:$M4),
mnemonic#"\t$R1, $M3, $R2, $M4", []>;
-class TernaryRRD<string mnemonic, bits<16> opcode,
-SDPatternOperator operator, RegisterOperand cls>
-: InstRRD<opcode, (outs cls:$R1), (ins cls:$R1src, cls:$R3, cls:$R2),
+class TernaryRRD<string mnemonic, bits<16> opcode, SDPatternOperator operator,
+RegisterOperand cls1, RegisterOperand cls2>
+: InstRRD<opcode, (outs cls1:$R1), (ins cls2:$R1src, cls2:$R3, cls2:$R2),
mnemonic#"\t$R1, $R3, $R2",
-[(set cls:$R1, (operator cls:$R1src, cls:$R3, cls:$R2))]> {
+[(set cls1:$R1, (operator cls2:$R1src, cls2:$R3, cls2:$R2))]> {
let OpKey = mnemonic#cls;
let OpType = "reg";
let Constraints = "$R1 = $R1src";
@ -3661,12 +3739,13 @@ class SideEffectTernaryMemMemRSY<string mnemonic, bits<16> opcode,
}
class TernaryRXF<string mnemonic, bits<16> opcode, SDPatternOperator operator,
-RegisterOperand cls, SDPatternOperator load, bits<5> bytes>
-: InstRXF<opcode, (outs cls:$R1),
-(ins cls:$R1src, cls:$R3, bdxaddr12only:$XBD2),
+RegisterOperand cls1, RegisterOperand cls2,
+SDPatternOperator load, bits<5> bytes>
+: InstRXF<opcode, (outs cls1:$R1),
+(ins cls2:$R1src, cls2:$R3, bdxaddr12only:$XBD2),
mnemonic#"\t$R1, $R3, $XBD2",
-[(set cls:$R1, (operator cls:$R1src, cls:$R3,
-(load bdxaddr12only:$XBD2)))]> {
+[(set cls1:$R1, (operator cls2:$R1src, cls2:$R3,
+(load bdxaddr12only:$XBD2)))]> {
let OpKey = mnemonic#"r"#cls;
let OpType = "mem";
let Constraints = "$R1 = $R1src";

@ -0,0 +1,240 @@
//==- SystemZInstrHFP.td - Floating-point SystemZ instructions -*- tblgen-*-==//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// The instructions in this file implement SystemZ hexadecimal floating-point
// arithmetic. Since this format is not mapped to any source-language data
// type, these instructions are not used for code generation, but are provided
// for use with the assembler and disassembler only.
//
//===----------------------------------------------------------------------===//
//===----------------------------------------------------------------------===//
// Move instructions
//===----------------------------------------------------------------------===//
// Load and test.
let Defs = [CC] in {
def LTER : UnaryRR <"lter", 0x32, null_frag, FP32, FP32>;
def LTDR : UnaryRR <"ltdr", 0x22, null_frag, FP64, FP64>;
def LTXR : UnaryRRE<"ltxr", 0xB362, null_frag, FP128, FP128>;
}
//===----------------------------------------------------------------------===//
// Conversion instructions
//===----------------------------------------------------------------------===//
// Convert floating-point values to narrower representations.
def LEDR : UnaryRR <"ledr", 0x35, null_frag, FP32, FP64>;
def LEXR : UnaryRRE<"lexr", 0xB366, null_frag, FP32, FP128>;
def LDXR : UnaryRR <"ldxr", 0x25, null_frag, FP64, FP128>;
let isAsmParserOnly = 1 in {
def LRER : UnaryRR <"lrer", 0x35, null_frag, FP32, FP64>;
def LRDR : UnaryRR <"lrdr", 0x25, null_frag, FP64, FP128>;
}
// Extend floating-point values to wider representations.
def LDER : UnaryRRE<"lder", 0xB324, null_frag, FP64, FP32>;
def LXER : UnaryRRE<"lxer", 0xB326, null_frag, FP128, FP32>;
def LXDR : UnaryRRE<"lxdr", 0xB325, null_frag, FP128, FP64>;
def LDE : UnaryRXE<"lde", 0xED24, null_frag, FP64, 4>;
def LXE : UnaryRXE<"lxe", 0xED26, null_frag, FP128, 4>;
def LXD : UnaryRXE<"lxd", 0xED25, null_frag, FP128, 8>;
// Convert a signed integer register value to a floating-point one.
def CEFR : UnaryRRE<"cefr", 0xB3B4, null_frag, FP32, GR32>;
def CDFR : UnaryRRE<"cdfr", 0xB3B5, null_frag, FP64, GR32>;
def CXFR : UnaryRRE<"cxfr", 0xB3B6, null_frag, FP128, GR32>;
def CEGR : UnaryRRE<"cegr", 0xB3C4, null_frag, FP32, GR64>;
def CDGR : UnaryRRE<"cdgr", 0xB3C5, null_frag, FP64, GR64>;
def CXGR : UnaryRRE<"cxgr", 0xB3C6, null_frag, FP128, GR64>;
// Convert a floating-point register value to a signed integer value,
// with the second operand (modifier M3) specifying the rounding mode.
let Defs = [CC] in {
def CFER : BinaryRRFe<"cfer", 0xB3B8, GR32, FP32>;
def CFDR : BinaryRRFe<"cfdr", 0xB3B9, GR32, FP64>;
def CFXR : BinaryRRFe<"cfxr", 0xB3BA, GR32, FP128>;
def CGER : BinaryRRFe<"cger", 0xB3C8, GR64, FP32>;
def CGDR : BinaryRRFe<"cgdr", 0xB3C9, GR64, FP64>;
def CGXR : BinaryRRFe<"cgxr", 0xB3CA, GR64, FP128>;
}
// Convert BFP to HFP.
let Defs = [CC] in {
def THDER : UnaryRRE<"thder", 0xB358, null_frag, FP64, FP32>;
def THDR : UnaryRRE<"thdr", 0xB359, null_frag, FP64, FP64>;
}
// Convert HFP to BFP.
let Defs = [CC] in {
def TBEDR : BinaryRRFe<"tbedr", 0xB350, FP32, FP64>;
def TBDR : BinaryRRFe<"tbdr", 0xB351, FP64, FP64>;
}
//===----------------------------------------------------------------------===//
// Unary arithmetic
//===----------------------------------------------------------------------===//
// Negation (Load Complement).
let Defs = [CC] in {
def LCER : UnaryRR <"lcer", 0x33, null_frag, FP32, FP32>;
def LCDR : UnaryRR <"lcdr", 0x23, null_frag, FP64, FP64>;
def LCXR : UnaryRRE<"lcxr", 0xB363, null_frag, FP128, FP128>;
}
// Absolute value (Load Positive).
let Defs = [CC] in {
def LPER : UnaryRR <"lper", 0x30, null_frag, FP32, FP32>;
def LPDR : UnaryRR <"lpdr", 0x20, null_frag, FP64, FP64>;
def LPXR : UnaryRRE<"lpxr", 0xB360, null_frag, FP128, FP128>;
}
// Negative absolute value (Load Negative).
let Defs = [CC] in {
def LNER : UnaryRR <"lner", 0x31, null_frag, FP32, FP32>;
def LNDR : UnaryRR <"lndr", 0x21, null_frag, FP64, FP64>;
def LNXR : UnaryRRE<"lnxr", 0xB361, null_frag, FP128, FP128>;
}
// Halve.
def HER : UnaryRR <"her", 0x34, null_frag, FP32, FP32>;
def HDR : UnaryRR <"hdr", 0x24, null_frag, FP64, FP64>;
// Square root.
def SQER : UnaryRRE<"sqer", 0xB245, null_frag, FP32, FP32>;
def SQDR : UnaryRRE<"sqdr", 0xB244, null_frag, FP64, FP64>;
def SQXR : UnaryRRE<"sqxr", 0xB336, null_frag, FP128, FP128>;
def SQE : UnaryRXE<"sqe", 0xED34, null_frag, FP32, 4>;
def SQD : UnaryRXE<"sqd", 0xED35, null_frag, FP64, 8>;
// Round to an integer (rounding towards zero).
def FIER : UnaryRRE<"fier", 0xB377, null_frag, FP32, FP32>;
def FIDR : UnaryRRE<"fidr", 0xB37F, null_frag, FP64, FP64>;
def FIXR : UnaryRRE<"fixr", 0xB367, null_frag, FP128, FP128>;
//===----------------------------------------------------------------------===//
// Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition.
let Defs = [CC] in {
let isCommutable = 1 in {
def AER : BinaryRR<"aer", 0x3A, null_frag, FP32, FP32>;
def ADR : BinaryRR<"adr", 0x2A, null_frag, FP64, FP64>;
def AXR : BinaryRR<"axr", 0x36, null_frag, FP128, FP128>;
}
def AE : BinaryRX<"ae", 0x7A, null_frag, FP32, load, 4>;
def AD : BinaryRX<"ad", 0x6A, null_frag, FP64, load, 8>;
}
// Addition (unnormalized).
let Defs = [CC] in {
let isCommutable = 1 in {
def AUR : BinaryRR<"aur", 0x3E, null_frag, FP32, FP32>;
def AWR : BinaryRR<"awr", 0x2E, null_frag, FP64, FP64>;
}
def AU : BinaryRX<"au", 0x7E, null_frag, FP32, load, 4>;
def AW : BinaryRX<"aw", 0x6E, null_frag, FP64, load, 8>;
}
// Subtraction.
let Defs = [CC] in {
def SER : BinaryRR<"ser", 0x3B, null_frag, FP32, FP32>;
def SDR : BinaryRR<"sdr", 0x2B, null_frag, FP64, FP64>;
def SXR : BinaryRR<"sxr", 0x37, null_frag, FP128, FP128>;
def SE : BinaryRX<"se", 0x7B, null_frag, FP32, load, 4>;
def SD : BinaryRX<"sd", 0x6B, null_frag, FP64, load, 8>;
}
// Subtraction (unnormalized).
let Defs = [CC] in {
def SUR : BinaryRR<"sur", 0x3F, null_frag, FP32, FP32>;
def SWR : BinaryRR<"swr", 0x2F, null_frag, FP64, FP64>;
def SU : BinaryRX<"su", 0x7F, null_frag, FP32, load, 4>;
def SW : BinaryRX<"sw", 0x6F, null_frag, FP64, load, 8>;
}
// Multiplication.
let isCommutable = 1 in {
def MEER : BinaryRRE<"meer", 0xB337, null_frag, FP32, FP32>;
def MDR : BinaryRR <"mdr", 0x2C, null_frag, FP64, FP64>;
def MXR : BinaryRR <"mxr", 0x26, null_frag, FP128, FP128>;
}
def MEE : BinaryRXE<"mee", 0xED37, null_frag, FP32, load, 4>;
def MD : BinaryRX <"md", 0x6C, null_frag, FP64, load, 8>;
// Extending multiplication (f32 x f32 -> f64).
def MDER : BinaryRR<"mder", 0x3C, null_frag, FP64, FP32>;
def MDE : BinaryRX<"mde", 0x7C, null_frag, FP64, load, 4>;
let isAsmParserOnly = 1 in {
def MER : BinaryRR<"mer", 0x3C, null_frag, FP64, FP32>;
def ME : BinaryRX<"me", 0x7C, null_frag, FP64, load, 4>;
}
// Extending multiplication (f64 x f64 -> f128).
def MXDR : BinaryRR<"mxdr", 0x27, null_frag, FP128, FP64>;
def MXD : BinaryRX<"mxd", 0x67, null_frag, FP128, load, 8>;
// Fused multiply-add.
def MAER : TernaryRRD<"maer", 0xB32E, null_frag, FP32, FP32>;
def MADR : TernaryRRD<"madr", 0xB33E, null_frag, FP64, FP64>;
def MAE : TernaryRXF<"mae", 0xED2E, null_frag, FP32, FP32, load, 4>;
def MAD : TernaryRXF<"mad", 0xED3E, null_frag, FP64, FP64, load, 8>;
// Fused multiply-subtract.
def MSER : TernaryRRD<"mser", 0xB32F, null_frag, FP32, FP32>;
def MSDR : TernaryRRD<"msdr", 0xB33F, null_frag, FP64, FP64>;
def MSE : TernaryRXF<"mse", 0xED2F, null_frag, FP32, FP32, load, 4>;
def MSD : TernaryRXF<"msd", 0xED3F, null_frag, FP64, FP64, load, 8>;
// Multiplication (unnormalized).
def MYR : BinaryRRD<"myr", 0xB33B, null_frag, FP128, FP64>;
def MYHR : BinaryRRD<"myhr", 0xB33D, null_frag, FP64, FP64>;
def MYLR : BinaryRRD<"mylr", 0xB339, null_frag, FP64, FP64>;
def MY : BinaryRXF<"my", 0xED3B, null_frag, FP128, FP64, load, 8>;
def MYH : BinaryRXF<"myh", 0xED3D, null_frag, FP64, FP64, load, 8>;
def MYL : BinaryRXF<"myl", 0xED39, null_frag, FP64, FP64, load, 8>;
// Fused multiply-add (unnormalized).
def MAYR : TernaryRRD<"mayr", 0xB33A, null_frag, FP128, FP64>;
def MAYHR : TernaryRRD<"mayhr", 0xB33C, null_frag, FP64, FP64>;
def MAYLR : TernaryRRD<"maylr", 0xB338, null_frag, FP64, FP64>;
def MAY : TernaryRXF<"may", 0xED3A, null_frag, FP128, FP64, load, 8>;
def MAYH : TernaryRXF<"mayh", 0xED3C, null_frag, FP64, FP64, load, 8>;
def MAYL : TernaryRXF<"mayl", 0xED38, null_frag, FP64, FP64, load, 8>;
// Division.
def DER : BinaryRR <"der", 0x3D, null_frag, FP32, FP32>;
def DDR : BinaryRR <"ddr", 0x2D, null_frag, FP64, FP64>;
def DXR : BinaryRRE<"dxr", 0xB22D, null_frag, FP128, FP128>;
def DE : BinaryRX <"de", 0x7D, null_frag, FP32, load, 4>;
def DD : BinaryRX <"dd", 0x6D, null_frag, FP64, load, 8>;
//===----------------------------------------------------------------------===//
// Comparisons
//===----------------------------------------------------------------------===//
let Defs = [CC] in {
def CER : CompareRR <"cer", 0x39, null_frag, FP32, FP32>;
def CDR : CompareRR <"cdr", 0x29, null_frag, FP64, FP64>;
def CXR : CompareRRE<"cxr", 0xB369, null_frag, FP128, FP128>;
def CE : CompareRX<"ce", 0x79, null_frag, FP32, load, 4>;
def CD : CompareRX<"cd", 0x69, null_frag, FP64, load, 8>;
}

@ -908,6 +908,238 @@ def : InstRW<[FXa, Lat30, GroupAlone], (instregex "SFASR$")>;
def : InstRW<[FXa, LSU, Lat30, GroupAlone], (instregex "LFAS$")>;
def : InstRW<[FXb, Lat3, GroupAlone], (instregex "SRNM(B|T)?$")>;
// --------------------- Hexadecimal floating point ------------------------- //
//===----------------------------------------------------------------------===//
// HFP: Move instructions
//===----------------------------------------------------------------------===//
// Load and Test
def : InstRW<[VecXsPm, Lat4], (instregex "LT(D|E)R$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "LTXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Conversion instructions
//===----------------------------------------------------------------------===//
// Load rounded
def : InstRW<[VecBF], (instregex "(LEDR|LRER)$")>;
def : InstRW<[VecBF], (instregex "LEXR$")>;
def : InstRW<[VecDF2, VecDF2], (instregex "(LDXR|LRDR)$")>;
// Load lengthened
def : InstRW<[LSU], (instregex "LDE$")>;
def : InstRW<[FXb], (instregex "LDER$")>;
def : InstRW<[VecBF2, VecBF2, LSU, Lat12, GroupAlone], (instregex "LX(D|E)$")>;
def : InstRW<[VecBF2, VecBF2, GroupAlone], (instregex "LX(D|E)R$")>;
// Convert from fixed
def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CE(F|G)R$")>;
def : InstRW<[FXb, VecBF, Lat9, BeginGroup], (instregex "CD(F|G)R$")>;
def : InstRW<[FXb, VecDF2, VecDF2, Lat12, GroupAlone], (instregex "CX(F|G)R$")>;
// Convert to fixed
def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CF(E|D)R$")>;
def : InstRW<[FXb, VecBF, Lat11, BeginGroup], (instregex "CG(E|D)R$")>;
def : InstRW<[FXb, VecDF, VecDF, Lat20, BeginGroup], (instregex "C(F|G)XR$")>;
// Convert BFP to HFP / HFP to BFP.
def : InstRW<[VecBF], (instregex "THD(E)?R$")>;
def : InstRW<[VecBF], (instregex "TB(E)?DR$")>;
//===----------------------------------------------------------------------===//
// HFP: Unary arithmetic
//===----------------------------------------------------------------------===//
// Load Complement / Negative / Positive
def : InstRW<[VecXsPm, Lat4], (instregex "L(C|N|P)DR$")>;
def : InstRW<[VecXsPm, Lat4], (instregex "L(C|N|P)ER$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "L(C|N|P)XR$")>;
// Halve
def : InstRW<[VecBF], (instregex "H(E|D)R$")>;
// Square root
def : InstRW<[VecFPd, LSU], (instregex "SQ(E|D)$")>;
def : InstRW<[VecFPd], (instregex "SQ(E|D)R$")>;
def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "SQXR$")>;
// Load FP integer
def : InstRW<[VecBF], (instregex "FIER$")>;
def : InstRW<[VecBF], (instregex "FIDR$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "FIXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition
def : InstRW<[VecBF, LSU, Lat12], (instregex "A(E|D|U|W)$")>;
def : InstRW<[VecBF], (instregex "A(E|D|U|W)R$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "AXR$")>;
// Subtraction
def : InstRW<[VecBF, LSU, Lat12], (instregex "S(E|D|U|W)$")>;
def : InstRW<[VecBF], (instregex "S(E|D|U|W)R$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "SXR$")>;
// Multiply
def : InstRW<[VecBF, LSU, Lat12], (instregex "M(D|DE|E|EE)$")>;
def : InstRW<[VecBF], (instregex "M(D|DE|E|EE)R$")>;
def : InstRW<[VecBF2, VecBF2, LSU, Lat12, GroupAlone], (instregex "MXD$")>;
def : InstRW<[VecBF2, VecBF2, GroupAlone], (instregex "MXDR$")>;
def : InstRW<[VecDF2, VecDF2, Lat20, GroupAlone], (instregex "MXR$")>;
def : InstRW<[VecBF2, VecBF2, LSU, Lat12, GroupAlone], (instregex "MY(H|L)?$")>;
def : InstRW<[VecBF2, VecBF2, GroupAlone], (instregex "MY(H|L)?R$")>;
// Multiply and add / subtract
def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A|S)E$")>;
def : InstRW<[VecBF, GroupAlone], (instregex "M(A|S)ER$")>;
def : InstRW<[VecBF, LSU, Lat12, GroupAlone], (instregex "M(A|S)D$")>;
def : InstRW<[VecBF], (instregex "M(A|S)DR$")>;
def : InstRW<[VecBF2, VecBF2, LSU, Lat12, GroupAlone], (instregex "MAY(H|L)?$")>;
def : InstRW<[VecBF2, VecBF2, GroupAlone], (instregex "MAY(H|L)?R$")>;
// Division
def : InstRW<[VecFPd, LSU], (instregex "D(E|D)$")>;
def : InstRW<[VecFPd], (instregex "D(E|D)R$")>;
def : InstRW<[VecFPd, VecFPd, GroupAlone], (instregex "DXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Comparisons
//===----------------------------------------------------------------------===//
// Compare
def : InstRW<[VecXsPm, LSU, Lat8], (instregex "C(E|D)$")>;
def : InstRW<[VecXsPm, Lat4], (instregex "C(E|D)R$")>;
def : InstRW<[VecDF, VecDF, Lat20, GroupAlone], (instregex "CXR$")>;
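Each `instregex` pattern in the scheduling entries above selects instruction records by name. The Python sketch below is not part of the commit; it shows which names a few of these patterns would select. The matching semantics are not stated in this diff, so it assumes a match anchored at the start of the name (as with Python's `re.match`), with the trailing `$` anchoring the end. The helper `instregex` and the name list are illustrative only.

```python
import re


def instregex(pattern, names):
    """Return the names selected by pattern.

    Assumption: the match starts at the beginning of the name; the
    trailing "$" in the pattern then forces a whole-name match.
    """
    rx = re.compile(pattern)
    return [name for name in names if rx.match(name)]


# A small sample of instruction names defined in this commit.
names = ["LTDR", "LTER", "LTXR", "LTDTR", "LDE", "LDE32", "LDER",
         "MDR", "MDER", "MER", "MEER", "MXDR"]

print(instregex("LT(D|E)R$", names))       # ['LTDR', 'LTER']
print(instregex("LDE$", names))            # ['LDE']
print(instregex("M(D|DE|E|EE)R$", names))  # ['MDR', 'MDER', 'MER', 'MEER']
```

Under these assumptions the `$` is what keeps `LDE$` from also selecting `LDE32` or `LDER`, and what keeps the HFP pattern `LT(D|E)R$` from selecting the DFP instruction `LTDTR`.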
// ------------------------ Decimal floating point -------------------------- //
//===----------------------------------------------------------------------===//
// DFP: Move instructions
//===----------------------------------------------------------------------===//
// Load and Test
def : InstRW<[VecDF], (instregex "LTDTR$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "LTXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Conversion instructions
//===----------------------------------------------------------------------===//
// Load rounded
def : InstRW<[VecDF, Lat15], (instregex "LEDTR$")>;
def : InstRW<[VecDF, VecDF, Lat20], (instregex "LDXTR$")>;
// Load lengthened
def : InstRW<[VecDF], (instregex "LDETR$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "LXDTR$")>;
// Convert from fixed / logical
def : InstRW<[FXb, VecDF, Lat30, BeginGroup], (instregex "CD(F|G)TR(A)?$")>;
def : InstRW<[FXb, VecDF2, VecDF2, Lat30, GroupAlone], (instregex "CX(F|G)TR(A)?$")>;
def : InstRW<[FXb, VecDF, Lat30, BeginGroup], (instregex "CDL(F|G)TR$")>;
def : InstRW<[FXb, VecDF2, VecDF2, Lat30, GroupAlone], (instregex "CXL(F|G)TR$")>;
// Convert to fixed / logical
def : InstRW<[FXb, VecDF, Lat30, BeginGroup], (instregex "C(F|G)DTR(A)?$")>;
def : InstRW<[FXb, VecDF, VecDF, Lat30, BeginGroup], (instregex "C(F|G)XTR(A)?$")>;
def : InstRW<[FXb, VecDF, Lat30, BeginGroup], (instregex "CL(F|G)DTR$")>;
def : InstRW<[FXb, VecDF, VecDF, Lat30, BeginGroup], (instregex "CL(F|G)XTR$")>;
// Convert from / to signed / unsigned packed
def : InstRW<[FXb, VecDF, Lat9, BeginGroup], (instregex "CD(S|U)TR$")>;
def : InstRW<[FXb, FXb, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "CX(S|U)TR$")>;
def : InstRW<[FXb, VecDF, Lat12, BeginGroup], (instregex "C(S|U)DTR$")>;
def : InstRW<[FXb, FXb, VecDF2, VecDF2, Lat15, BeginGroup], (instregex "C(S|U)XTR$")>;
// Convert from / to zoned
def : InstRW<[LSU, VecDF, Lat11, BeginGroup], (instregex "CDZT$")>;
def : InstRW<[LSU, LSU, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "CXZT$")>;
def : InstRW<[FXb, LSU, VecDF, Lat11, BeginGroup], (instregex "CZDT$")>;
def : InstRW<[FXb, LSU, VecDF, VecDF, Lat15, GroupAlone], (instregex "CZXT$")>;
// Convert from / to packed
def : InstRW<[LSU, VecDF, Lat11, BeginGroup], (instregex "CDPT$")>;
def : InstRW<[LSU, LSU, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "CXPT$")>;
def : InstRW<[FXb, LSU, VecDF, Lat11, BeginGroup], (instregex "CPDT$")>;
def : InstRW<[FXb, LSU, VecDF, VecDF, Lat15, GroupAlone], (instregex "CPXT$")>;
// Perform floating-point operation
def : InstRW<[LSU, Lat30, GroupAlone], (instregex "PFPO$")>;
//===----------------------------------------------------------------------===//
// DFP: Unary arithmetic
//===----------------------------------------------------------------------===//
// Load FP integer
def : InstRW<[VecDF], (instregex "FIDTR$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "FIXTR$")>;
// Extract biased exponent
def : InstRW<[FXb, VecDF, Lat12, BeginGroup], (instregex "EEDTR$")>;
def : InstRW<[FXb, VecDF, Lat12, BeginGroup], (instregex "EEXTR$")>;
// Extract significance
def : InstRW<[FXb, VecDF, Lat12, BeginGroup], (instregex "ESDTR$")>;
def : InstRW<[FXb, VecDF, VecDF, Lat15, BeginGroup], (instregex "ESXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition
def : InstRW<[VecDF], (instregex "ADTR(A)?$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "AXTR(A)?$")>;
// Subtraction
def : InstRW<[VecDF], (instregex "SDTR(A)?$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "SXTR(A)?$")>;
// Multiply
def : InstRW<[VecDF, Lat30], (instregex "MDTR(A)?$")>;
def : InstRW<[VecDF2, VecDF2, Lat30, GroupAlone], (instregex "MXTR(A)?$")>;
// Division
def : InstRW<[VecDF, Lat30], (instregex "DDTR(A)?$")>;
def : InstRW<[VecDF2, VecDF2, Lat30, GroupAlone], (instregex "DXTR(A)?$")>;
// Quantize
def : InstRW<[VecDF], (instregex "QADTR$")>;
def : InstRW<[VecDF2, VecDF2, Lat11, GroupAlone], (instregex "QAXTR$")>;
// Reround
def : InstRW<[FXb, VecDF, Lat11], (instregex "RRDTR$")>;
def : InstRW<[FXb, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "RRXTR$")>;
// Shift significand left/right
def : InstRW<[LSU, VecDF, Lat11], (instregex "S(L|R)DT$")>;
def : InstRW<[LSU, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "S(L|R)XT$")>;
// Insert biased exponent
def : InstRW<[FXb, VecDF, Lat11], (instregex "IEDTR$")>;
def : InstRW<[FXb, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "IEXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Comparisons
//===----------------------------------------------------------------------===//
// Compare
def : InstRW<[VecDF], (instregex "(K|C)DTR$")>;
def : InstRW<[VecDF, VecDF, Lat11, GroupAlone], (instregex "(K|C)XTR$")>;
// Compare biased exponent
def : InstRW<[VecDF], (instregex "CEDTR$")>;
def : InstRW<[VecDF], (instregex "CEXTR$")>;
// Test Data Class/Group
def : InstRW<[LSU, VecDF, Lat11], (instregex "TD(C|G)(E|D)T$")>;
def : InstRW<[LSU, VecDF2, VecDF2, Lat15, GroupAlone], (instregex "TD(C|G)XT$")>;
// --------------------------------- Vector --------------------------------- //
//===----------------------------------------------------------------------===//

@ -839,5 +839,224 @@ def : InstRW<[FXU, Lat30, GroupAlone], (instregex "SFASR$")>;
def : InstRW<[FXU, LSU, Lat30, GroupAlone], (instregex "LFAS$")>;
def : InstRW<[FXU, Lat2, GroupAlone], (instregex "SRNM(B|T)?$")>;
// --------------------- Hexadecimal floating point ------------------------- //
//===----------------------------------------------------------------------===//
// HFP: Move instructions
//===----------------------------------------------------------------------===//
// Load and Test
def : InstRW<[FPU], (instregex "LT(D|E)R$")>;
def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "LTXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Conversion instructions
//===----------------------------------------------------------------------===//
// Load rounded
def : InstRW<[FPU], (instregex "(LEDR|LRER)$")>;
def : InstRW<[FPU], (instregex "LEXR$")>;
def : InstRW<[FPU], (instregex "(LDXR|LRDR)$")>;
// Load lengthened
def : InstRW<[LSU], (instregex "LDE$")>;
def : InstRW<[FXU], (instregex "LDER$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "LX(D|E)$")>;
def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "LX(D|E)R$")>;
// Convert from fixed
def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F|G)R$")>;
def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F|G)R$")>;
def : InstRW<[FXU, FPU2, FPU2, Lat11, GroupAlone], (instregex "CX(F|G)R$")>;
// Convert to fixed
def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E|D)R$")>;
def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E|D)R$")>;
def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F|G)XR$")>;
// Convert BFP to HFP / HFP to BFP.
def : InstRW<[FPU], (instregex "THD(E)?R$")>;
def : InstRW<[FPU], (instregex "TB(E)?DR$")>;
//===----------------------------------------------------------------------===//
// HFP: Unary arithmetic
//===----------------------------------------------------------------------===//
// Load Complement / Negative / Positive
def : InstRW<[FPU], (instregex "L(C|N|P)DR$")>;
def : InstRW<[FPU], (instregex "L(C|N|P)ER$")>;
def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "L(C|N|P)XR$")>;
// Halve
def : InstRW<[FPU], (instregex "H(E|D)R$")>;
// Square root
def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E|D)$")>;
def : InstRW<[FPU, Lat30], (instregex "SQ(E|D)R$")>;
def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "SQXR$")>;
// Load FP integer
def : InstRW<[FPU], (instregex "FIER$")>;
def : InstRW<[FPU], (instregex "FIDR$")>;
def : InstRW<[FPU2, FPU2, Lat15, GroupAlone], (instregex "FIXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition
def : InstRW<[FPU, LSU, Lat12], (instregex "A(E|D|U|W)$")>;
def : InstRW<[FPU], (instregex "A(E|D|U|W)R$")>;
def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "AXR$")>;
// Subtraction
def : InstRW<[FPU, LSU, Lat12], (instregex "S(E|D|U|W)$")>;
def : InstRW<[FPU], (instregex "S(E|D|U|W)R$")>;
def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "SXR$")>;
// Multiply
def : InstRW<[FPU, LSU, Lat12], (instregex "M(D|DE|E|EE)$")>;
def : InstRW<[FPU], (instregex "M(D|DE|E|EE)R$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "MXD$")>;
def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "MXDR$")>;
def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "MXR$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "MY(H|L)?$")>;
def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "MY(H|L)?R$")>;
// Multiply and add / subtract
def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)E$")>;
def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)ER$")>;
def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)D$")>;
def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)DR$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat12, GroupAlone], (instregex "MAY(H|L)?$")>;
def : InstRW<[FPU2, FPU2, GroupAlone], (instregex "MAY(H|L)?R$")>;
// Division
def : InstRW<[FPU, LSU, Lat30], (instregex "D(E|D)$")>;
def : InstRW<[FPU, Lat30], (instregex "D(E|D)R$")>;
def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "DXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Comparisons
//===----------------------------------------------------------------------===//
// Compare
def : InstRW<[FPU, LSU, Lat12], (instregex "C(E|D)$")>;
def : InstRW<[FPU], (instregex "C(E|D)R$")>;
def : InstRW<[FPU, FPU, Lat15], (instregex "CXR$")>;
// ------------------------ Decimal floating point -------------------------- //
//===----------------------------------------------------------------------===//
// DFP: Move instructions
//===----------------------------------------------------------------------===//
// Load and Test
def : InstRW<[DFU, Lat20], (instregex "LTDTR$")>;
def : InstRW<[DFU2, DFU2, Lat20, GroupAlone], (instregex "LTXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Conversion instructions
//===----------------------------------------------------------------------===//
// Load rounded
def : InstRW<[DFU, Lat30], (instregex "LEDTR$")>;
def : InstRW<[DFU, DFU, Lat30], (instregex "LDXTR$")>;
// Load lengthened
def : InstRW<[DFU, Lat20], (instregex "LDETR$")>;
def : InstRW<[DFU2, DFU2, Lat20, GroupAlone], (instregex "LXDTR$")>;
// Convert from fixed / logical
def : InstRW<[FXU, DFU, Lat30, GroupAlone], (instregex "CD(F|G)TR(A)?$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat30, GroupAlone], (instregex "CX(F|G)TR(A)?$")>;
def : InstRW<[FXU, DFU, Lat11, GroupAlone], (instregex "CDL(F|G)TR$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat11, GroupAlone], (instregex "CXL(F|G)TR$")>;
// Convert to fixed / logical
def : InstRW<[FXU, DFU, Lat30, GroupAlone], (instregex "C(F|G)DTR(A)?$")>;
def : InstRW<[FXU, DFU, DFU, Lat30, GroupAlone], (instregex "C(F|G)XTR(A)?$")>;
def : InstRW<[FXU, DFU, Lat30, GroupAlone], (instregex "CL(F|G)DTR$")>;
def : InstRW<[FXU, DFU, DFU, Lat30, GroupAlone], (instregex "CL(F|G)XTR$")>;
// Convert from / to signed / unsigned packed
def : InstRW<[FXU, DFU, Lat12, GroupAlone], (instregex "CD(S|U)TR$")>;
def : InstRW<[FXU, FXU, DFU2, DFU2, Lat20, GroupAlone], (instregex "CX(S|U)TR$")>;
def : InstRW<[FXU, DFU, Lat12, GroupAlone], (instregex "C(S|U)DTR$")>;
def : InstRW<[FXU, FXU, DFU2, DFU2, Lat20, GroupAlone], (instregex "C(S|U)XTR$")>;
// Perform floating-point operation
def : InstRW<[LSU, Lat30, GroupAlone], (instregex "PFPO$")>;
//===----------------------------------------------------------------------===//
// DFP: Unary arithmetic
//===----------------------------------------------------------------------===//
// Load FP integer
def : InstRW<[DFU, Lat20], (instregex "FIDTR$")>;
def : InstRW<[DFU2, DFU2, Lat20, GroupAlone], (instregex "FIXTR$")>;
// Extract biased exponent
def : InstRW<[FXU, DFU, Lat15, GroupAlone], (instregex "EEDTR$")>;
def : InstRW<[FXU, DFU, Lat15, GroupAlone], (instregex "EEXTR$")>;
// Extract significance
def : InstRW<[FXU, DFU, Lat15, GroupAlone], (instregex "ESDTR$")>;
def : InstRW<[FXU, DFU, DFU, Lat20, GroupAlone], (instregex "ESXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition
def : InstRW<[DFU, Lat30], (instregex "ADTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "AXTR(A)?$")>;
// Subtraction
def : InstRW<[DFU, Lat30], (instregex "SDTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "SXTR(A)?$")>;
// Multiply
def : InstRW<[DFU, Lat30], (instregex "MDTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "MXTR(A)?$")>;
// Division
def : InstRW<[DFU, Lat30], (instregex "DDTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "DXTR(A)?$")>;
// Quantize
def : InstRW<[DFU, Lat30], (instregex "QADTR$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "QAXTR$")>;
// Reround
def : InstRW<[FXU, DFU, Lat30], (instregex "RRDTR$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat30, GroupAlone], (instregex "RRXTR$")>;
// Shift significand left/right
def : InstRW<[LSU, DFU, Lat11], (instregex "S(L|R)DT$")>;
def : InstRW<[LSU, DFU2, DFU2, Lat15, GroupAlone], (instregex "S(L|R)XT$")>;
// Insert biased exponent
def : InstRW<[FXU, DFU, Lat11], (instregex "IEDTR$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat15, GroupAlone], (instregex "IEXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Comparisons
//===----------------------------------------------------------------------===//
// Compare
def : InstRW<[DFU, Lat11], (instregex "(K|C)DTR$")>;
def : InstRW<[DFU, DFU, Lat15, GroupAlone], (instregex "(K|C)XTR$")>;
// Compare biased exponent
def : InstRW<[DFU, Lat8], (instregex "CEDTR$")>;
def : InstRW<[DFU, Lat9], (instregex "CEXTR$")>;
// Test Data Class/Group
def : InstRW<[LSU, DFU, Lat15], (instregex "TD(C|G)(E|D)T$")>;
def : InstRW<[LSU, DFU2, DFU2, Lat15, GroupAlone], (instregex "TD(C|G)XT$")>;
}
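The `instregex` patterns above each select a small family of instruction names. As a rough illustration only, the sketch below uses `std::regex` to show which names a pattern such as `L(C|N|P)DR$` would cover. The helper `matchesWholeName` is hypothetical and not part of LLVM, and the sketch assumes the pattern is applied to the whole instruction name; TableGen's own matching rules may differ in detail.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper (not an LLVM API): returns true when `pattern`
// matches the entire `name`. This only approximates how a scheduling
// model's instregex selects instruction definitions.
bool matchesWholeName(const std::string &pattern, const std::string &name) {
  return std::regex_match(name, std::regex(pattern));
}

// Small self-check: the register-register and extended forms are
// covered by different patterns, so they can get different resources.
void checkLoadComplementPatterns() {
  assert(matchesWholeName("L(C|N|P)DR$", "LCDR"));
  assert(matchesWholeName("L(C|N|P)DR$", "LPDR"));
  assert(!matchesWholeName("L(C|N|P)DR$", "LCXR"));
  assert(matchesWholeName("L(C|N|P)XR$", "LCXR"));
}
```

Under these assumptions, `SQ(E|D)$` covers the memory forms `SQE` and `SQD` while `SQ(E|D)R$` covers `SQER` and `SQDR`, which is why the two lines can list different resources (with and without `LSU`).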

@ -877,5 +877,230 @@ def : InstRW<[FXU, Lat30, GroupAlone], (instregex "SFASR$")>;
def : InstRW<[FXU, LSU, Lat30, GroupAlone], (instregex "LFAS$")>;
def : InstRW<[FXU, Lat2, GroupAlone], (instregex "SRNM(B|T)?$")>;
// --------------------- Hexadecimal floating point ------------------------- //
//===----------------------------------------------------------------------===//
// HFP: Move instructions
//===----------------------------------------------------------------------===//
// Load and Test
def : InstRW<[FPU], (instregex "LT(D|E)R$")>;
def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "LTXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Conversion instructions
//===----------------------------------------------------------------------===//
// Load rounded
def : InstRW<[FPU], (instregex "(LEDR|LRER)$")>;
def : InstRW<[FPU], (instregex "LEXR$")>;
def : InstRW<[FPU], (instregex "(LDXR|LRDR)$")>;
// Load lengthened
def : InstRW<[LSU], (instregex "LDE$")>;
def : InstRW<[FXU], (instregex "LDER$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "LX(D|E)$")>;
def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "LX(D|E)R$")>;
// Convert from fixed
def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CE(F|G)R$")>;
def : InstRW<[FXU, FPU, Lat9, GroupAlone], (instregex "CD(F|G)R$")>;
def : InstRW<[FXU, FPU2, FPU2, Lat11, GroupAlone], (instregex "CX(F|G)R$")>;
// Convert to fixed
def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CF(E|D)R$")>;
def : InstRW<[FXU, FPU, Lat12, GroupAlone], (instregex "CG(E|D)R$")>;
def : InstRW<[FXU, FPU, FPU, Lat20, GroupAlone], (instregex "C(F|G)XR$")>;
// Convert BFP to HFP / HFP to BFP.
def : InstRW<[FPU], (instregex "THD(E)?R$")>;
def : InstRW<[FPU], (instregex "TB(E)?DR$")>;
//===----------------------------------------------------------------------===//
// HFP: Unary arithmetic
//===----------------------------------------------------------------------===//
// Load Complement / Negative / Positive
def : InstRW<[FPU], (instregex "L(C|N|P)DR$")>;
def : InstRW<[FPU], (instregex "L(C|N|P)ER$")>;
def : InstRW<[FPU2, FPU2, Lat9, GroupAlone], (instregex "L(C|N|P)XR$")>;
// Halve
def : InstRW<[FPU], (instregex "H(E|D)R$")>;
// Square root
def : InstRW<[FPU, LSU, Lat30], (instregex "SQ(E|D)$")>;
def : InstRW<[FPU, Lat30], (instregex "SQ(E|D)R$")>;
def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "SQXR$")>;
// Load FP integer
def : InstRW<[FPU], (instregex "FIER$")>;
def : InstRW<[FPU], (instregex "FIDR$")>;
def : InstRW<[FPU2, FPU2, Lat15, GroupAlone], (instregex "FIXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition
def : InstRW<[FPU, LSU, Lat12], (instregex "A(E|D|U|W)$")>;
def : InstRW<[FPU], (instregex "A(E|D|U|W)R$")>;
def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "AXR$")>;
// Subtraction
def : InstRW<[FPU, LSU, Lat12], (instregex "S(E|D|U|W)$")>;
def : InstRW<[FPU], (instregex "S(E|D|U|W)R$")>;
def : InstRW<[FPU2, FPU2, Lat20, GroupAlone], (instregex "SXR$")>;
// Multiply
def : InstRW<[FPU, LSU, Lat12], (instregex "M(D|DE|E|EE)$")>;
def : InstRW<[FPU], (instregex "M(D|DE|E|EE)R$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "MXD$")>;
def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "MXDR$")>;
def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "MXR$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat15, GroupAlone], (instregex "MY(H|L)?$")>;
def : InstRW<[FPU2, FPU2, Lat10, GroupAlone], (instregex "MY(H|L)?R$")>;
// Multiply and add / subtract
def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)E$")>;
def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)ER$")>;
def : InstRW<[FPU, LSU, Lat12, GroupAlone], (instregex "M(A|S)D$")>;
def : InstRW<[FPU, GroupAlone], (instregex "M(A|S)DR$")>;
def : InstRW<[FPU2, FPU2, LSU, Lat12, GroupAlone], (instregex "MAY(H|L)?$")>;
def : InstRW<[FPU2, FPU2, GroupAlone], (instregex "MAY(H|L)?R$")>;
// Division
def : InstRW<[FPU, LSU, Lat30], (instregex "D(E|D)$")>;
def : InstRW<[FPU, Lat30], (instregex "D(E|D)R$")>;
def : InstRW<[FPU2, FPU2, Lat30, GroupAlone], (instregex "DXR$")>;
//===----------------------------------------------------------------------===//
// HFP: Comparisons
//===----------------------------------------------------------------------===//
// Compare
def : InstRW<[FPU, LSU, Lat12], (instregex "C(E|D)$")>;
def : InstRW<[FPU], (instregex "C(E|D)R$")>;
def : InstRW<[FPU, FPU, Lat15], (instregex "CXR$")>;
// ------------------------ Decimal floating point -------------------------- //
//===----------------------------------------------------------------------===//
// DFP: Move instructions
//===----------------------------------------------------------------------===//
// Load and Test
def : InstRW<[DFU, Lat20], (instregex "LTDTR$")>;
def : InstRW<[DFU2, DFU2, Lat20, GroupAlone], (instregex "LTXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Conversion instructions
//===----------------------------------------------------------------------===//
// Load rounded
def : InstRW<[DFU, Lat30], (instregex "LEDTR$")>;
def : InstRW<[DFU, DFU, Lat30], (instregex "LDXTR$")>;
// Load lengthened
def : InstRW<[DFU, Lat20], (instregex "LDETR$")>;
def : InstRW<[DFU2, DFU2, Lat20, GroupAlone], (instregex "LXDTR$")>;
// Convert from fixed / logical
def : InstRW<[FXU, DFU, Lat30, GroupAlone], (instregex "CD(F|G)TR(A)?$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat30, GroupAlone], (instregex "CX(F|G)TR(A)?$")>;
def : InstRW<[FXU, DFU, Lat11, GroupAlone], (instregex "CDL(F|G)TR$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat11, GroupAlone], (instregex "CXL(F|G)TR$")>;
// Convert to fixed / logical
def : InstRW<[FXU, DFU, Lat30, GroupAlone], (instregex "C(F|G)DTR(A)?$")>;
def : InstRW<[FXU, DFU, DFU, Lat30, GroupAlone], (instregex "C(F|G)XTR(A)?$")>;
def : InstRW<[FXU, DFU, Lat30, GroupAlone], (instregex "CL(F|G)DTR$")>;
def : InstRW<[FXU, DFU, DFU, Lat30, GroupAlone], (instregex "CL(F|G)XTR$")>;
// Convert from / to signed / unsigned packed
def : InstRW<[FXU, DFU, Lat12, GroupAlone], (instregex "CD(S|U)TR$")>;
def : InstRW<[FXU, FXU, DFU2, DFU2, Lat20, GroupAlone], (instregex "CX(S|U)TR$")>;
def : InstRW<[FXU, DFU, Lat12, GroupAlone], (instregex "C(S|U)DTR$")>;
def : InstRW<[FXU, FXU, DFU2, DFU2, Lat20, GroupAlone], (instregex "C(S|U)XTR$")>;
// Convert from / to zoned
def : InstRW<[LSU, DFU2, Lat7, GroupAlone], (instregex "CDZT$")>;
def : InstRW<[LSU, LSU, DFU2, DFU2, Lat10, GroupAlone], (instregex "CXZT$")>;
def : InstRW<[FXU, LSU, DFU, Lat11, GroupAlone], (instregex "CZDT$")>;
def : InstRW<[FXU, LSU, DFU, DFU, Lat15, GroupAlone], (instregex "CZXT$")>;
// Perform floating-point operation
def : InstRW<[LSU, Lat30, GroupAlone], (instregex "PFPO$")>;
//===----------------------------------------------------------------------===//
// DFP: Unary arithmetic
//===----------------------------------------------------------------------===//
// Load FP integer
def : InstRW<[DFU, Lat20], (instregex "FIDTR$")>;
def : InstRW<[DFU2, DFU2, Lat20, GroupAlone], (instregex "FIXTR$")>;
// Extract biased exponent
def : InstRW<[FXU, DFU, Lat15, GroupAlone], (instregex "EEDTR$")>;
def : InstRW<[FXU, DFU, Lat15, GroupAlone], (instregex "EEXTR$")>;
// Extract significance
def : InstRW<[FXU, DFU, Lat15, GroupAlone], (instregex "ESDTR$")>;
def : InstRW<[FXU, DFU, DFU, Lat20, GroupAlone], (instregex "ESXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Binary arithmetic
//===----------------------------------------------------------------------===//
// Addition
def : InstRW<[DFU, Lat30], (instregex "ADTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "AXTR(A)?$")>;
// Subtraction
def : InstRW<[DFU, Lat30], (instregex "SDTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "SXTR(A)?$")>;
// Multiply
def : InstRW<[DFU, Lat30], (instregex "MDTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "MXTR(A)?$")>;
// Division
def : InstRW<[DFU, Lat30], (instregex "DDTR(A)?$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "DXTR(A)?$")>;
// Quantize
def : InstRW<[DFU, Lat30], (instregex "QADTR$")>;
def : InstRW<[DFU2, DFU2, Lat30, GroupAlone], (instregex "QAXTR$")>;
// Reround
def : InstRW<[FXU, DFU, Lat30], (instregex "RRDTR$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat30, GroupAlone], (instregex "RRXTR$")>;
// Shift significand left/right
def : InstRW<[LSU, DFU, Lat11], (instregex "S(L|R)DT$")>;
def : InstRW<[LSU, DFU2, DFU2, Lat15, GroupAlone], (instregex "S(L|R)XT$")>;
// Insert biased exponent
def : InstRW<[FXU, DFU, Lat11], (instregex "IEDTR$")>;
def : InstRW<[FXU, DFU2, DFU2, Lat15, GroupAlone], (instregex "IEXTR$")>;
//===----------------------------------------------------------------------===//
// DFP: Comparisons
//===----------------------------------------------------------------------===//
// Compare
def : InstRW<[DFU, Lat11], (instregex "(K|C)DTR$")>;
def : InstRW<[DFU, DFU, Lat15, GroupAlone], (instregex "(K|C)XTR$")>;
// Compare biased exponent
def : InstRW<[DFU, Lat8], (instregex "CEDTR$")>;
def : InstRW<[DFU, Lat9], (instregex "CEXTR$")>;
// Test Data Class/Group
def : InstRW<[LSU, DFU, Lat15], (instregex "TD(C|G)(E|D)T$")>;
def : InstRW<[LSU, DFU2, DFU2, Lat15, GroupAlone], (instregex "TD(C|G)XT$")>;
}

@ -42,8 +42,10 @@ SystemZSubtarget::SystemZSubtarget(const Triple &TT, const std::string &CPU,
HasMiscellaneousExtensions(false),
HasExecutionHint(false), HasLoadAndTrap(false),
HasTransactionalExecution(false), HasProcessorAssist(false),
HasDFPZonedConversion(false),
HasVector(false), HasLoadStoreOnCond2(false),
HasLoadAndZeroRightmostByte(false), HasMessageSecurityAssist5(false),
HasDFPPackedConversion(false),
TargetTriple(TT), InstrInfo(initializeSubtargetDependencies(CPU, FS)),
TLInfo(TM, *this), TSInfo(), FrameLowering() {}

@ -47,10 +47,12 @@ protected:
bool HasLoadAndTrap;
bool HasTransactionalExecution;
bool HasProcessorAssist;
bool HasDFPZonedConversion;
bool HasVector;
bool HasLoadStoreOnCond2;
bool HasLoadAndZeroRightmostByte;
bool HasMessageSecurityAssist5;
bool HasDFPPackedConversion;
private:
Triple TargetTriple;
@ -133,6 +135,9 @@ public:
// Return true if the target has the processor-assist facility.
bool hasProcessorAssist() const { return HasProcessorAssist; }
// Return true if the target has the DFP zoned-conversion facility.
bool hasDFPZonedConversion() const { return HasDFPZonedConversion; }
// Return true if the target has the load-and-zero-rightmost-byte facility.
bool hasLoadAndZeroRightmostByte() const {
return HasLoadAndZeroRightmostByte;
@ -142,6 +147,9 @@ public:
// extension facility 5.
bool hasMessageSecurityAssist5() const { return HasMessageSecurityAssist5; }
// Return true if the target has the DFP packed-conversion facility.
bool hasDFPPackedConversion() const { return HasDFPPackedConversion; }
// Return true if the target has the vector facility.
bool hasVector() const { return HasVector; }

@ -84,8 +84,8 @@ WebAssemblyTargetLowering::WebAssemblyTargetLowering(
ISD::SETULT, ISD::SETULE, ISD::SETUGT, ISD::SETUGE})
setCondCodeAction(CC, T, Expand);
// Expand floating-point library function operators.
for (auto Op : {ISD::FSIN, ISD::FCOS, ISD::FSINCOS, ISD::FPOWI, ISD::FPOW,
ISD::FREM, ISD::FMA})
for (auto Op : {ISD::FSIN, ISD::FCOS, ISD::FSINCOS, ISD::FPOW, ISD::FREM,
ISD::FMA})
setOperationAction(Op, T, Expand);
// Note supported floating-point library function operators that otherwise
// default to expand.

@ -80,6 +80,12 @@ static cl::opt<int> ExperimentalPrefLoopAlignment(
" of the loop header PC will be 0)."),
cl::Hidden);
static cl::opt<bool> MulConstantOptimization(
"mul-constant-optimization", cl::init(true),
cl::desc("Replace 'mul x, Const' with more effective instructions like "
"SHIFT, LEA, etc."),
cl::Hidden);
/// Call this when the user attempts to do something unsupported, like
/// returning a double without SSE2 enabled on x86_64. This is not fatal, unlike
/// report_fatal_error, so calling code should attempt to recover without
@ -670,7 +676,6 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
setOperationAction(ISD::FSINCOS, VT, Expand);
setOperationAction(ISD::FCOS, VT, Expand);
setOperationAction(ISD::FREM, VT, Expand);
setOperationAction(ISD::FPOWI, VT, Expand);
setOperationAction(ISD::FCOPYSIGN, VT, Expand);
setOperationAction(ISD::FPOW, VT, Expand);
setOperationAction(ISD::FLOG, VT, Expand);
@ -30928,6 +30933,75 @@ static SDValue reduceVMULWidth(SDNode *N, SelectionDAG &DAG,
}
}
static SDValue combineMulSpecial(uint64_t MulAmt, SDNode *N, SelectionDAG &DAG,
EVT VT, SDLoc DL) {
auto combineMulShlAddOrSub = [&](int Mult, int Shift, bool isAdd) {
SDValue Result = DAG.getNode(X86ISD::MUL_IMM, DL, VT, N->getOperand(0),
DAG.getConstant(Mult, DL, VT));
Result = DAG.getNode(ISD::SHL, DL, VT, Result,
DAG.getConstant(Shift, DL, MVT::i8));
Result = DAG.getNode(isAdd ? ISD::ADD : ISD::SUB, DL, VT, Result,
N->getOperand(0));
return Result;
};
auto combineMulMulAddOrSub = [&](bool isAdd) {
SDValue Result = DAG.getNode(X86ISD::MUL_IMM, DL, VT, N->getOperand(0),
DAG.getConstant(9, DL, VT));
Result = DAG.getNode(ISD::MUL, DL, VT, Result, DAG.getConstant(3, DL, VT));
Result = DAG.getNode(isAdd ? ISD::ADD : ISD::SUB, DL, VT, Result,
N->getOperand(0));
return Result;
};
switch (MulAmt) {
default:
break;
case 11:
// mul x, 11 => add ((shl (mul x, 5), 1), x)
return combineMulShlAddOrSub(5, 1, /*isAdd*/ true);
case 21:
// mul x, 21 => add ((shl (mul x, 5), 2), x)
return combineMulShlAddOrSub(5, 2, /*isAdd*/ true);
case 22:
// mul x, 22 => add (add ((shl (mul x, 5), 2), x), x)
return DAG.getNode(ISD::ADD, DL, VT, N->getOperand(0),
combineMulShlAddOrSub(5, 2, /*isAdd*/ true));
case 19:
// mul x, 19 => sub ((shl (mul x, 5), 2), x)
return combineMulShlAddOrSub(5, 2, /*isAdd*/ false);
case 13:
// mul x, 13 => add ((shl (mul x, 3), 2), x)
return combineMulShlAddOrSub(3, 2, /*isAdd*/ true);
case 23:
// mul x, 23 => sub ((shl (mul x, 3), 3), x)
return combineMulShlAddOrSub(3, 3, /*isAdd*/ false);
case 14:
// mul x, 14 => add (add ((shl (mul x, 3), 2), x), x)
return DAG.getNode(ISD::ADD, DL, VT, N->getOperand(0),
combineMulShlAddOrSub(3, 2, /*isAdd*/ true));
case 26:
// mul x, 26 => sub ((mul (mul x, 9), 3), x)
return combineMulMulAddOrSub(/*isAdd*/ false);
case 28:
// mul x, 28 => add ((mul (mul x, 9), 3), x)
return combineMulMulAddOrSub(/*isAdd*/ true);
case 29:
// mul x, 29 => add (add ((mul (mul x, 9), 3), x), x)
return DAG.getNode(ISD::ADD, DL, VT, N->getOperand(0),
combineMulMulAddOrSub(/*isAdd*/ true));
case 30:
// mul x, 30 => sub (sub ((shl x, 5), x), x)
return DAG.getNode(
ISD::SUB, DL, VT,
DAG.getNode(ISD::SUB, DL, VT,
DAG.getNode(ISD::SHL, DL, VT, N->getOperand(0),
DAG.getConstant(5, DL, MVT::i8)),
N->getOperand(0)),
N->getOperand(0));
}
return SDValue();
}
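The decompositions named in the comments of `combineMulSpecial` can be checked with plain integer arithmetic. The sketch below is not LLVM code: `mulShlAddOrSub`, `mulMulAddOrSub`, and `mulSpecial` are hypothetical stand-ins that replace DAG nodes with `uint64_t` operations and follow the operand order written in the comments, that is, the shifted or multiplied value first and `x` second.

```cpp
#include <cassert>
#include <cstdint>

// (x * mult) << shift, then add or subtract x. Hypothetical stand-in
// for the MUL_IMM + SHL + ADD/SUB node sequence.
constexpr uint64_t mulShlAddOrSub(uint64_t x, uint64_t mult, unsigned shift,
                                  bool isAdd) {
  const uint64_t r = (x * mult) << shift;
  return isAdd ? r + x : r - x;
}

// (x * 9) * 3, then add or subtract x.
constexpr uint64_t mulMulAddOrSub(uint64_t x, bool isAdd) {
  const uint64_t r = (x * 9) * 3;
  return isAdd ? r + x : r - x;
}

// Mirrors the switch in the diff, case by case, on plain integers.
uint64_t mulSpecial(uint64_t amt, uint64_t x) {
  switch (amt) {
  case 11: return mulShlAddOrSub(x, 5, 1, true);      // 10x + x
  case 21: return mulShlAddOrSub(x, 5, 2, true);      // 20x + x
  case 22: return x + mulShlAddOrSub(x, 5, 2, true);  // 21x + x
  case 19: return mulShlAddOrSub(x, 5, 2, false);     // 20x - x
  case 13: return mulShlAddOrSub(x, 3, 2, true);      // 12x + x
  case 23: return mulShlAddOrSub(x, 3, 3, false);     // 24x - x
  case 14: return x + mulShlAddOrSub(x, 3, 2, true);  // 13x + x
  case 26: return mulMulAddOrSub(x, false);           // 27x - x
  case 28: return mulMulAddOrSub(x, true);            // 27x + x
  case 29: return x + mulMulAddOrSub(x, true);        // 28x + x
  case 30: return ((x << 5) - x) - x;                 // 32x - x - x
  default: return x * amt;                            // no special form
  }
}
```

Because all arithmetic is unsigned and wraps modulo 2^64, the identities hold for every input, including values where the intermediate products overflow. Note that subtraction is not commutative, so the operand order matters for the `sub` cases (19, 23, 26, 30).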
/// Optimize a single multiply with constant into two operations in order to
/// implement it with two cheaper instructions, e.g. LEA + SHL, LEA + LEA.
static SDValue combineMul(SDNode *N, SelectionDAG &DAG,
@ -30937,6 +31011,8 @@ static SDValue combineMul(SDNode *N, SelectionDAG &DAG,
if (DCI.isBeforeLegalize() && VT.isVector())
return reduceVMULWidth(N, DAG, Subtarget);
if (!MulConstantOptimization)
return SDValue();
// An imul is usually smaller than the alternative sequence.
if (DAG.getMachineFunction().getFunction()->optForMinSize())
return SDValue();
@ -30992,7 +31068,8 @@ static SDValue combineMul(SDNode *N, SelectionDAG &DAG,
else
NewMul = DAG.getNode(X86ISD::MUL_IMM, DL, VT, NewMul,
DAG.getConstant(MulAmt2, DL, VT));
}
} else if (!Subtarget.slowLEA())
NewMul = combineMulSpecial(MulAmt, N, DAG, VT, DL);
if (!NewMul) {
assert(MulAmt != 0 &&

@ -377,7 +377,6 @@ private:
int StoreCount = 0;
};
struct HashedExpression;
namespace llvm {
template <> struct DenseMapInfo<const Expression *> {
static const Expression *getEmptyKey() {
@ -391,41 +390,25 @@ template <> struct DenseMapInfo<const Expression *> {
return reinterpret_cast<const Expression *>(Val);
}
static unsigned getHashValue(const Expression *E) {
return static_cast<unsigned>(E->getHashValue());
return static_cast<unsigned>(E->getComputedHash());
}
static unsigned getHashValue(const HashedExpression &HE);
static bool isEqual(const HashedExpression &LHS, const Expression *RHS);
static bool isEqual(const Expression *LHS, const Expression *RHS) {
if (LHS == RHS)
return true;
if (LHS == getTombstoneKey() || RHS == getTombstoneKey() ||
LHS == getEmptyKey() || RHS == getEmptyKey())
return false;
// Compare hashes before equality. This is *not* what the hashtable does,
// since it is computing it modulo the number of buckets, whereas we are
// using the full hash keyspace. Since the hashes are precomputed, this
// check is *much* faster than equality.
if (LHS->getComputedHash() != RHS->getComputedHash())
return false;
return *LHS == *RHS;
}
};
} // end namespace llvm
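The `isEqual` comment above describes comparing precomputed hashes before falling back to full equality. A minimal standalone sketch of that idea follows. It does not use `DenseMap` or `Expression`; `CachedKey` and `keysEqual` are hypothetical names, and `nullptr` here loosely plays the role of the empty and tombstone sentinel keys.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical key type whose hash is computed once at construction
// and never changes afterwards.
struct CachedKey {
  std::string text;
  std::size_t hash;
  explicit CachedKey(std::string t)
      : text(std::move(t)), hash(std::hash<std::string>{}(text)) {}
};

// Equality with a cheap early exit: identical pointers are equal,
// sentinels never equal a real key, and keys with different full
// hashes cannot be equal, so the expensive comparison is skipped.
bool keysEqual(const CachedKey *lhs, const CachedKey *rhs) {
  if (lhs == rhs)
    return true;
  if (lhs == nullptr || rhs == nullptr)
    return false;
  if (lhs->hash != rhs->hash)
    return false;
  return lhs->text == rhs->text; // full comparison only on a hash match
}
```

The early exit is safe because equal keys always have equal hashes; a hash match alone proves nothing, so the full comparison is still required in that case.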
// This is just a wrapper around Expression that computes the hash value once at
// creation time. Hash values for an Expression can't change once they are
// inserted into the DenseMap (it breaks DenseMap), so they must be immutable at
// that point anyway.
struct HashedExpression {
const Expression *E;
unsigned HashVal;
HashedExpression(const Expression *E)
: E(E), HashVal(DenseMapInfo<const Expression *>::getHashValue(E)) {}
};
unsigned
DenseMapInfo<const Expression *>::getHashValue(const HashedExpression &HE) {
return HE.HashVal;
}
bool DenseMapInfo<const Expression *>::isEqual(const HashedExpression &LHS,
const Expression *RHS) {
return isEqual(LHS.E, RHS);
}
namespace {
class NewGVN {
Function &F;
@ -707,7 +690,7 @@ private:
void markPredicateUsersTouched(Instruction *);
void markValueLeaderChangeTouched(CongruenceClass *CC);
void markMemoryLeaderChangeTouched(CongruenceClass *CC);
void markPhiOfOpsChanged(const HashedExpression &HE);
void markPhiOfOpsChanged(const Expression *E);
void addPredicateUsers(const PredicateBase *, Instruction *) const;
void addMemoryUsers(const MemoryAccess *To, MemoryAccess *U) const;
void addAdditionalUsers(Value *To, Value *User) const;
@ -956,8 +939,12 @@ const Expression *NewGVN::checkSimplificationResults(Expression *E,
if (CC && CC->getDefiningExpr()) {
// If we simplified to something else, we need to communicate
// that we're users of the value we simplified to.
if (I != V)
addAdditionalUsers(V, I);
if (I != V) {
// Don't add temporary instructions to the user lists.
if (!AllTempInstructions.count(I))
addAdditionalUsers(V, I);
}
if (I)
DEBUG(dbgs() << "Simplified " << *I << " to "
<< " expression " << *CC->getDefiningExpr() << "\n");
@ -2195,8 +2182,8 @@ void NewGVN::moveValueToNewCongruenceClass(Instruction *I, const Expression *E,
// For a given expression, mark the phi of ops instructions that could have
// changed as a result.
void NewGVN::markPhiOfOpsChanged(const HashedExpression &HE) {
touchAndErase(ExpressionToPhiOfOps, HE);
void NewGVN::markPhiOfOpsChanged(const Expression *E) {
touchAndErase(ExpressionToPhiOfOps, E);
}
// Perform congruence finding on a given value numbering expression.
@ -2210,14 +2197,13 @@ void NewGVN::performCongruenceFinding(Instruction *I, const Expression *E) {
assert(!IClass->isDead() && "Found a dead class");
CongruenceClass *EClass = nullptr;
HashedExpression HE(E);
if (const auto *VE = dyn_cast<VariableExpression>(E)) {
EClass = ValueToClass.lookup(VE->getVariableValue());
} else if (isa<DeadExpression>(E)) {
EClass = TOPClass;
}
if (!EClass) {
auto lookupResult = ExpressionToClass.insert_as({E, nullptr}, HE);
auto lookupResult = ExpressionToClass.insert({E, nullptr});
// If it's not in the value table, create a new congruence class.
if (lookupResult.second) {
@ -2268,7 +2254,7 @@ void NewGVN::performCongruenceFinding(Instruction *I, const Expression *E) {
<< "\n");
if (ClassChanged) {
moveValueToNewCongruenceClass(I, E, IClass, EClass);
markPhiOfOpsChanged(HE);
markPhiOfOpsChanged(E);
}
markUsersTouched(I);
@ -2502,9 +2488,8 @@ NewGVN::makePossiblePhiOfOps(Instruction *I, bool HasBackedge,
// Clone the instruction, create an expression from it, and see if we
// have a leader.
Instruction *ValueOp = I->clone();
auto Iter = TempToMemory.end();
if (MemAccess)
Iter = TempToMemory.insert({ValueOp, MemAccess}).first;
TempToMemory.insert({ValueOp, MemAccess});
for (auto &Op : ValueOp->operands()) {
Op = Op->DoPHITranslation(PHIBlock, PredBB);
@ -2523,7 +2508,7 @@ NewGVN::makePossiblePhiOfOps(Instruction *I, bool HasBackedge,
AllTempInstructions.erase(ValueOp);
ValueOp->deleteValue();
if (MemAccess)
TempToMemory.erase(Iter);
TempToMemory.erase(ValueOp);
if (!E)
return nullptr;
FoundVal = findPhiOfOpsLeader(E, PredBB);

@ -7173,7 +7173,7 @@ LoopVectorizationCostModel::getInstructionCost(Instruction *I, unsigned VF) {
// Note: Even if all instructions are scalarized, return true if any memory
// accesses appear in the loop to get benefits from address folding etc.
bool TypeNotScalarized =
VF > 1 && VectorTy->isVectorTy() && TTI.getNumberOfParts(VectorTy) < VF;
VF > 1 && !VectorTy->isVoidTy() && TTI.getNumberOfParts(VectorTy) < VF;
return VectorizationCostTy(C, TypeNotScalarized);
}
@ -7312,7 +7312,7 @@ unsigned LoopVectorizationCostModel::getInstructionCost(Instruction *I,
Type *RetTy = I->getType();
if (canTruncateToMinimalBitwidth(I, VF))
RetTy = IntegerType::get(RetTy->getContext(), MinBWs[I]);
VectorTy = isScalarAfterVectorization(I, VF) ? RetTy : ToVectorTy(RetTy, VF);
VectorTy = ToVectorTy(RetTy, VF);
auto SE = PSE.getSE();
// TODO: We need to estimate the cost of intrinsic calls.
@ -7446,9 +7446,8 @@ unsigned LoopVectorizationCostModel::getInstructionCost(Instruction *I,
Op2VK = TargetTransformInfo::OK_UniformValue;
}
SmallVector<const Value *, 4> Operands(I->operand_values());
unsigned N = isScalarAfterVectorization(I, VF) ? VF : 1;
return N * TTI.getArithmeticInstrCost(I->getOpcode(), VectorTy, Op1VK,
Op2VK, Op1VP, Op2VP, Operands);
return TTI.getArithmeticInstrCost(I->getOpcode(), VectorTy, Op1VK,
Op2VK, Op1VP, Op2VP, Operands);
}
case Instruction::Select: {
SelectInst *SI = cast<SelectInst>(I);
@ -7471,15 +7470,7 @@ unsigned LoopVectorizationCostModel::getInstructionCost(Instruction *I,
}
case Instruction::Store:
case Instruction::Load: {
unsigned Width = VF;
if (Width > 1) {
InstWidening Decision = getWideningDecision(I, Width);
assert(Decision != CM_Unknown &&
"CM decision should be taken at this point");
if (Decision == CM_Scalarize)
Width = 1;
}
VectorTy = ToVectorTy(getMemInstValueType(I), Width);
VectorTy = ToVectorTy(getMemInstValueType(I), VF);
return getMemoryInstructionCost(I, VF);
}
case Instruction::ZExt:

@ -0,0 +1,52 @@
# RUN: llc -run-pass=prologepilog -verify-machineinstrs %s -o - | FileCheck %s
--- |
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-linux-gnu"
define void @ScavengeForFrameWithoutOffset() { ret void }
...
---
name: ScavengeForFrameWithoutOffset
tracksRegLiveness: true
stack:
- { id: 0, type: spill-slot, offset: 0, size: 32, alignment: 8 }
body: |
bb.0:
liveins: %d16_d17_d18_d19
%x0 = COPY %xzr
%x1 = COPY %xzr
%x2 = COPY %xzr
%x3 = COPY %xzr
%x4 = COPY %xzr
%x5 = COPY %xzr
%x6 = COPY %xzr
%x7 = COPY %xzr
%x8 = COPY %xzr
%x9 = COPY %xzr
%x10 = COPY %xzr
%x11 = COPY %xzr
%x12 = COPY %xzr
%x13 = COPY %xzr
%x14 = COPY %xzr
%x15 = COPY %xzr
%x16 = COPY %xzr
%x17 = COPY %xzr
%x18 = COPY %xzr
%x19 = COPY %xzr
%x20 = COPY %xzr
%x21 = COPY %xzr
%x22 = COPY %xzr
%x23 = COPY %xzr
%x24 = COPY %xzr
%x25 = COPY %xzr
%x26 = COPY %xzr
%x27 = COPY %xzr
%x28 = COPY %xzr
%fp = COPY %xzr
%lr = COPY %xzr
ST1Fourv1d killed %d16_d17_d18_d19, %stack.0 :: (store 32 into %stack.0, align 8)
# CHECK: STRXui killed %[[SCAVREG:x[0-9]+|fp|lr]], %sp, [[SPOFFSET:[0-9]+]] :: (store 8 into %stack.1)
# CHECK-NEXT: %[[SCAVREG]] = ADDXri %sp, {{[0-9]+}}, 0
# CHECK-NEXT: ST1Fourv1d killed %d16_d17_d18_d19, killed %[[SCAVREG]] :: (store 32 into %stack.0, align 8)
# CHECK-NEXT: %[[SCAVREG]] = LDRXui %sp, [[SPOFFSET]] :: (load 8 from %stack.1)
...

@ -23,7 +23,7 @@ define amdgpu_kernel void @v_test_add_v2i16(<2 x i16> addrspace(1)* %out, <2 x i
; GFX9: s_load_dword [[VAL0:s[0-9]+]]
; GFX9: s_load_dword [[VAL1:s[0-9]+]]
; GFX9: v_mov_b32_e32 [[VVAL1:v[0-9]+]]
; GFX9: v_pk_add_u16 v{{[0-9]+}}, [[VVAL1]], [[VAL0]]
; GFX9: v_pk_add_u16 v{{[0-9]+}}, [[VAL0]], [[VVAL1]]
; VI: s_add_i32
; VI: s_add_i32
@ -50,7 +50,7 @@ define amdgpu_kernel void @s_test_add_self_v2i16(<2 x i16> addrspace(1)* %out, <
; FIXME: VI should not scalarize arg access.
; GCN-LABEL: {{^}}s_test_add_v2i16_kernarg:
; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, s{{[0-9]+}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, s{{[0-9]+}}, v{{[0-9]+}}
; VI: v_add_i32
; VI: v_add_i32_sdwa
@ -62,10 +62,11 @@ define amdgpu_kernel void @s_test_add_v2i16_kernarg(<2 x i16> addrspace(1)* %out
; GCN-LABEL: {{^}}v_test_add_v2i16_constant:
; GFX9: s_mov_b32 [[CONST:s[0-9]+]], 0x1c8007b{{$}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, [[CONST]], v{{[0-9]+}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, [[CONST]]
; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, 0x7b, v{{[0-9]+}}
; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, 0x1c8, v{{[0-9]+}}
; VI-DAG: v_mov_b32_e32 v[[SCONST:[0-9]+]], 0x1c8
; VI-DAG: v_add_u16_sdwa v{{[0-9]+}}, v[[SCONST]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
define amdgpu_kernel void @v_test_add_v2i16_constant(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in0) #1 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()
%gep.out = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %out, i32 %tid
@ -79,10 +80,11 @@ define amdgpu_kernel void @v_test_add_v2i16_constant(<2 x i16> addrspace(1)* %ou
; FIXME: Need to handle non-uniform case for function below (load without gep).
; GCN-LABEL: {{^}}v_test_add_v2i16_neg_constant:
; GFX9: s_mov_b32 [[CONST:s[0-9]+]], 0xfc21fcb3{{$}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, [[CONST]], v{{[0-9]+}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, [[CONST]]
; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, 0xfffffcb3, v{{[0-9]+}}
; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, 0xfffffc21, v{{[0-9]+}}
; VI-DAG: v_mov_b32_e32 v[[SCONST:[0-9]+]], 0xfffffc21
; VI-DAG: v_add_u16_sdwa v{{[0-9]+}}, v[[SCONST]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
define amdgpu_kernel void @v_test_add_v2i16_neg_constant(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in0) #1 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()
%gep.out = getelementptr inbounds <2 x i16>, <2 x i16> addrspace(1)* %out, i32 %tid
@ -96,11 +98,11 @@ define amdgpu_kernel void @v_test_add_v2i16_neg_constant(<2 x i16> addrspace(1)*
; GCN-LABEL: {{^}}v_test_add_v2i16_inline_neg1:
; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, -1{{$}}
; VI: v_mov_b32_e32 v[[SCONST:[0-9]+]], -1
; VI: flat_load_ushort [[LOAD0:v[0-9]+]]
; VI: flat_load_ushort [[LOAD1:v[0-9]+]]
; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, -1, [[LOAD0]]
; VI-DAG: v_add_u16_sdwa v{{[0-9]+}}, v[[SCONST]], [[LOAD0]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
; VI-DAG: v_add_u16_e32 v{{[0-9]+}}, -1, [[LOAD1]]
; VI-DAG: v_lshlrev_b32_e32 v{{[0-9]+}}, 16,
; VI: v_or_b32_e32
define amdgpu_kernel void @v_test_add_v2i16_inline_neg1(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in0) #1 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()
@ -114,7 +116,7 @@ define amdgpu_kernel void @v_test_add_v2i16_inline_neg1(<2 x i16> addrspace(1)*
; GCN-LABEL: {{^}}v_test_add_v2i16_inline_lo_zero_hi:
; GFX9: s_mov_b32 [[K:s[0-9]+]], 32{{$}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, [[K]], v{{[0-9]+}}{{$}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, [[K]]{{$}}
; VI-NOT: v_add_u16
; VI: v_add_u16_e32 v{{[0-9]+}}, 32, v{{[0-9]+}}
@ -134,12 +136,12 @@ define amdgpu_kernel void @v_test_add_v2i16_inline_lo_zero_hi(<2 x i16> addrspac
; The high element gives fp
; GCN-LABEL: {{^}}v_test_add_v2i16_inline_fp_split:
; GFX9: s_mov_b32 [[K:s[0-9]+]], 1.0
; GFX9: v_pk_add_u16 v{{[0-9]+}}, [[K]], v{{[0-9]+}}{{$}}
; GFX9: v_pk_add_u16 v{{[0-9]+}}, v{{[0-9]+}}, [[K]]{{$}}
; VI-NOT: v_add_u16
; VI: v_add_u16_e32 v{{[0-9]+}}, 0x3f80, v{{[0-9]+}}
; VI: v_mov_b32_e32 v[[K:[0-9]+]], 0x3f80
; VI: v_add_u16_sdwa v{{[0-9]+}}, v[[K]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
; VI-NOT: v_add_u16
; VI: v_lshlrev_b32_e32 v{{[0-9]+}}, 16,
; VI: v_or_b32_e32
define amdgpu_kernel void @v_test_add_v2i16_inline_fp_split(<2 x i16> addrspace(1)* %out, <2 x i16> addrspace(1)* %in0) #1 {
%tid = call i32 @llvm.amdgcn.workitem.id.x()
@ -191,19 +193,17 @@ define amdgpu_kernel void @v_test_add_v2i16_zext_to_v2i32(<2 x i32> addrspace(1)
; GFX9: flat_load_dword [[A:v[0-9]+]]
; GFX9: flat_load_dword [[B:v[0-9]+]]
; GFX9: v_mov_b32_e32 v{{[0-9]+}}, 0{{$}}
; GFX9: v_pk_add_u16 [[ADD:v[0-9]+]], [[A]], [[B]]
; GFX9-DAG: v_and_b32_e32 v[[ELT0:[0-9]+]], 0xffff, [[ADD]]
; GFX9-DAG: v_lshrrev_b32_e32 v[[ELT1:[0-9]+]], 16, [[ADD]]
; GFX9: buffer_store_dwordx4
; VI-DAG: v_mov_b32_e32 v{{[0-9]+}}, 0{{$}}
; VI: flat_load_ushort v[[A_LO:[0-9]+]]
; VI: flat_load_ushort v[[A_HI:[0-9]+]]
; VI: flat_load_ushort v[[B_LO:[0-9]+]]
; VI: flat_load_ushort v[[B_HI:[0-9]+]]
; VI-DAG: v_mov_b32_e32 v{{[0-9]+}}, 0{{$}}
; VI-DAG: v_mov_b32_e32 v{{[0-9]+}}, 0{{$}}
; VI-DAG: v_add_u16_e32
; VI-DAG: v_add_u16_e32

@ -1,12 +1,16 @@
; RUN: llc -march=amdgcn -mcpu=fiji < %s | FileCheck --check-prefix=GCN --check-prefix=VI %s
; RUN: llc -march=amdgcn -mcpu=fiji -amdgpu-sdwa-peephole=0 < %s | FileCheck --check-prefix=GCN --check-prefix=VI %s
; RUN: llc -march=amdgcn -mcpu=fiji < %s | FileCheck --check-prefix=GCN --check-prefix=VI-SDWA %s
; RUN: llc -march=amdgcn -mcpu=bonaire < %s | FileCheck --check-prefix=GCN --check-prefix=CI %s
; GCN-LABEL: {{^}}bfe_combine8:
; VI: v_bfe_u32 v[[BFE:[0-9]+]], v{{[0-9]+}}, 8, 8
; VI: v_lshlrev_b32_e32 v[[ADDRBASE:[0-9]+]], 2, v[[BFE]]
; VI-SDWA: v_mov_b32_e32 v[[SHIFT:[0-9]+]], 2
; VI-SDWA: v_lshlrev_b32_sdwa v[[ADDRBASE:[0-9]+]], v[[SHIFT]], v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:BYTE_1
; CI: v_lshrrev_b32_e32 v[[SHR:[0-9]+]], 6, v{{[0-9]+}}
; CI: v_and_b32_e32 v[[ADDRLO:[0-9]+]], 0x3fc, v[[SHR]]
; VI: v_add_i32_e32 v[[ADDRLO:[0-9]+]], vcc, s{{[0-9]+}}, v[[ADDRBASE]]
; VI-SDWA: v_add_i32_e32 v[[ADDRLO:[0-9]+]], vcc, s{{[0-9]+}}, v[[ADDRBASE]]
; GCN: load_dword v{{[0-9]+}}, v{{\[}}[[ADDRLO]]:
define amdgpu_kernel void @bfe_combine8(i32 addrspace(1)* nocapture %arg, i32 %x) {
%id = tail call i32 @llvm.amdgcn.workitem.id.x() #2
@ -22,6 +26,10 @@ define amdgpu_kernel void @bfe_combine8(i32 addrspace(1)* nocapture %arg, i32 %x
; GCN-LABEL: {{^}}bfe_combine16:
; VI: v_bfe_u32 v[[BFE:[0-9]+]], v{{[0-9]+}}, 16, 16
; VI: v_lshlrev_b32_e32 v[[ADDRBASE:[0-9]+]], {{[^,]+}}, v[[BFE]]
; VI-SDWA: v_mov_b32_e32 v[[SHIFT:[0-9]+]], 15
; VI-SDWA: v_lshlrev_b32_sdwa v[[ADDRBASE1:[0-9]+]], v[[SHIFT]], v{{[0-9]+}} dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-SDWA: v_lshlrev_b64 v{{\[}}[[ADDRBASE:[0-9]+]]:{{[^\]+}}], 2, v{{\[}}[[ADDRBASE1]]:{{[^\]+}}]
; VI-SDWA: v_add_i32_e32 v[[ADDRLO:[0-9]+]], vcc, s{{[0-9]+}}, v[[ADDRBASE]]
; CI: v_lshrrev_b32_e32 v[[SHR:[0-9]+]], 1, v{{[0-9]+}}
; CI: v_and_b32_e32 v[[AND:[0-9]+]], 0x7fff8000, v[[SHR]]
; CI: v_lshl_b64 v{{\[}}[[ADDRLO:[0-9]+]]:{{[^\]+}}], v{{\[}}[[AND]]:{{[^\]+}}], 2

@ -1,4 +1,4 @@
; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=SI %s
; RUN: llc -march=amdgcn -amdgpu-sdwa-peephole=0 -verify-machineinstrs < %s | FileCheck -check-prefix=GCN -check-prefix=SI %s
declare i32 @llvm.amdgcn.workitem.id.x() #0

@ -51,7 +51,7 @@ define amdgpu_kernel void @commute_mul_imm_fneg_f32(float addrspace(1)* %out, fl
; FUNC-LABEL: @commute_add_lit_fabs_f32
; SI: buffer_load_dword [[X:v[0-9]+]], {{v\[[0-9]+:[0-9]+\]}}, {{s\[[0-9]+:[0-9]+\]}}, 0 addr64{{$}}
; SI: v_mov_b32_e32 [[K:v[0-9]+]], 0x44800000
; SI: v_add_f32_e64 [[REG:v[0-9]+]], [[K]], |[[X]]|
; SI: v_add_f32_e64 [[REG:v[0-9]+]], |[[X]]|, [[K]]
; SI: buffer_store_dword [[REG]]
define amdgpu_kernel void @commute_add_lit_fabs_f32(float addrspace(1)* %out, float addrspace(1)* %in) #0 {
%tid = call i32 @llvm.amdgcn.workitem.id.x() #1

@ -1,5 +1,5 @@
; RUN: llc -march=amdgcn -mcpu=tahiti < %s | FileCheck -check-prefix=GCN -check-prefix=SI -check-prefix=FUNC %s
; RUN: llc -march=amdgcn -mcpu=tonga -mattr=-flat-for-global < %s | FileCheck -check-prefix=GCN -check-prefix=VI -check-prefix=FUNC %s
; RUN: llc -march=amdgcn -mcpu=tonga -mattr=-flat-for-global -amdgpu-sdwa-peephole=0 < %s | FileCheck -check-prefix=GCN -check-prefix=VI -check-prefix=FUNC %s
declare i32 @llvm.amdgcn.workitem.id.x() nounwind readnone
declare i32 @llvm.amdgcn.workitem.id.y() nounwind readnone

@ -94,7 +94,6 @@ define amdgpu_kernel void @load_v4i8_to_v4f32_unaligned(<4 x float> addrspace(1)
; GCN-DAG: v_cvt_f32_ubyte3_e32
; GCN-DAG: v_lshrrev_b32_e32 v{{[0-9]+}}, 24
; GCN-DAG: v_lshrrev_b32_e32 v{{[0-9]+}}, 16
; SI-DAG: v_lshlrev_b32_e32 v{{[0-9]+}}, 16
; SI-DAG: v_lshlrev_b32_e32 v{{[0-9]+}}, 8

@ -55,7 +55,7 @@ define amdgpu_kernel void @fabs_v4f64(<4 x double> addrspace(1)* %out, <4 x doub
; SI-LABEL: {{^}}fabs_fold_f64:
; SI: s_load_dwordx2 [[ABS_VALUE:s\[[0-9]+:[0-9]+\]]], {{s\[[0-9]+:[0-9]+\]}}, 0xb
; SI-NOT: and
; SI: v_mul_f64 {{v\[[0-9]+:[0-9]+\]}}, {{v\[[0-9]+:[0-9]+\]}}, |[[ABS_VALUE]]|
; SI: v_mul_f64 {{v\[[0-9]+:[0-9]+\]}}, |[[ABS_VALUE]]|, {{v\[[0-9]+:[0-9]+\]}}
; SI: s_endpgm
define amdgpu_kernel void @fabs_fold_f64(double addrspace(1)* %out, double %in0, double %in1) {
%fabs = call double @llvm.fabs.f64(double %in0)
@ -67,7 +67,7 @@ define amdgpu_kernel void @fabs_fold_f64(double addrspace(1)* %out, double %in0,
; SI-LABEL: {{^}}fabs_fn_fold_f64:
; SI: s_load_dwordx2 [[ABS_VALUE:s\[[0-9]+:[0-9]+\]]], {{s\[[0-9]+:[0-9]+\]}}, 0xb
; SI-NOT: and
; SI: v_mul_f64 {{v\[[0-9]+:[0-9]+\]}}, {{v\[[0-9]+:[0-9]+\]}}, |[[ABS_VALUE]]|
; SI: v_mul_f64 {{v\[[0-9]+:[0-9]+\]}}, |[[ABS_VALUE]]|, {{v\[[0-9]+:[0-9]+\]}}
; SI: s_endpgm
define amdgpu_kernel void @fabs_fn_fold_f64(double addrspace(1)* %out, double %in0, double %in1) {
%fabs = call double @fabs(double %in0)

@ -75,7 +75,7 @@ define amdgpu_kernel void @fabs_v4f32(<4 x float> addrspace(1)* %out, <4 x float
; SI: s_load_dword [[ABS_VALUE:s[0-9]+]], s[{{[0-9]+:[0-9]+}}], 0xb
; VI: s_load_dword [[ABS_VALUE:s[0-9]+]], s[{{[0-9]+:[0-9]+}}], 0x2c
; GCN-NOT: and
; GCN: v_mul_f32_e64 v{{[0-9]+}}, v{{[0-9]+}}, |[[ABS_VALUE]]|
; GCN: v_mul_f32_e64 v{{[0-9]+}}, |[[ABS_VALUE]]|, v{{[0-9]+}}
define amdgpu_kernel void @fabs_fn_fold(float addrspace(1)* %out, float %in0, float %in1) {
%fabs = call float @fabs(float %in0)
%fmul = fmul float %fabs, %in1
@ -87,7 +87,7 @@ define amdgpu_kernel void @fabs_fn_fold(float addrspace(1)* %out, float %in0, fl
; SI: s_load_dword [[ABS_VALUE:s[0-9]+]], s[{{[0-9]+:[0-9]+}}], 0xb
; VI: s_load_dword [[ABS_VALUE:s[0-9]+]], s[{{[0-9]+:[0-9]+}}], 0x2c
; GCN-NOT: and
; GCN: v_mul_f32_e64 v{{[0-9]+}}, v{{[0-9]+}}, |[[ABS_VALUE]]|
; GCN: v_mul_f32_e64 v{{[0-9]+}}, |[[ABS_VALUE]]|, v{{[0-9]+}}
define amdgpu_kernel void @fabs_fold(float addrspace(1)* %out, float %in0, float %in1) {
%fabs = call float @llvm.fabs.f32(float %in0)
%fmul = fmul float %fabs, %in1

@ -96,9 +96,9 @@ entry:
}
; GCN-LABEL: {{^}}fadd_v2f16_imm_a:
; GCN: buffer_load_dword v[[B_V2_F16:[0-9]+]]
; GCN-DAG: buffer_load_dword v[[B_V2_F16:[0-9]+]]
; SI: v_cvt_f32_f16_e32 v[[B_F32_0:[0-9]+]], v[[B_V2_F16]]
; GCN: v_lshrrev_b32_e32 v[[B_F16_1:[0-9]+]], 16, v[[B_V2_F16]]
; SI: v_lshrrev_b32_e32 v[[B_F16_1:[0-9]+]], 16, v[[B_V2_F16]]
; SI: v_cvt_f32_f16_e32 v[[B_F32_1:[0-9]+]], v[[B_F16_1]]
; SI: v_add_f32_e32 v[[R_F32_0:[0-9]+]], 1.0, v[[B_F32_0]]
; SI: v_cvt_f16_f32_e32 v[[R_F16_0:[0-9]+]], v[[R_F32_0]]
@ -107,9 +107,9 @@ entry:
; SI-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_1]]
; SI: v_or_b32_e32 v[[R_V2_F16:[0-9]+]], v[[R_F16_HI]], v[[R_F16_0]]
; VI-DAG: v_add_f16_e32 v[[R_F16_1:[0-9]+]], 2.0, v[[B_F16_1]]
; VI-DAG: v_mov_b32_e32 v[[CONST2:[0-9]+]], 0x4000
; VI-DAG: v_add_f16_sdwa v[[R_F16_HI:[0-9]+]], v[[CONST2]], v[[B_V2_F16]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-DAG: v_add_f16_e32 v[[R_F16_0:[0-9]+]], 1.0, v[[B_V2_F16]]
; VI-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_1]]
; VI: v_or_b32_e32 v[[R_V2_F16:[0-9]+]], v[[R_F16_HI]], v[[R_F16_0]]
; GCN: buffer_store_dword v[[R_V2_F16]]
@ -125,9 +125,9 @@ entry:
}
; GCN-LABEL: {{^}}fadd_v2f16_imm_b:
; GCN: buffer_load_dword v[[A_V2_F16:[0-9]+]]
; GCN-DAG: buffer_load_dword v[[A_V2_F16:[0-9]+]]
; SI: v_cvt_f32_f16_e32 v[[A_F32_0:[0-9]+]], v[[A_V2_F16]]
; GCN: v_lshrrev_b32_e32 v[[A_F16_1:[0-9]+]], 16, v[[A_V2_F16]]
; SI-DAG: v_lshrrev_b32_e32 v[[A_F16_1:[0-9]+]], 16, v[[A_V2_F16]]
; SI: v_cvt_f32_f16_e32 v[[A_F32_1:[0-9]+]], v[[A_F16_1]]
; SI: v_add_f32_e32 v[[R_F32_0:[0-9]+]], 2.0, v[[A_F32_0]]
; SI: v_cvt_f16_f32_e32 v[[R_F16_0:[0-9]+]], v[[R_F32_0]]
@ -136,10 +136,10 @@ entry:
; SI-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_1]]
; SI: v_or_b32_e32 v[[R_V2_F16:[0-9]+]], v[[R_F16_HI]], v[[R_F16_0]]
; VI-DAG: v_add_f16_e32 v[[R_F16_0:[0-9]+]], 1.0, v[[A_F16_1]]
; VI-DAG: v_mov_b32_e32 v[[CONST1:[0-9]+]], 0x3c00
; VI-DAG: v_add_f16_sdwa v[[R_F16_0:[0-9]+]], v[[CONST1]], v[[A_V2_F16]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-DAG: v_add_f16_e32 v[[R_F16_1:[0-9]+]], 2.0, v[[A_V2_F16]]
; VI-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_0]]
; VI: v_or_b32_e32 v[[R_V2_F16:[0-9]+]], v[[R_F16_HI]], v[[R_F16_1]]
; VI: v_or_b32_e32 v[[R_V2_F16:[0-9]+]], v[[R_F16_0]], v[[R_F16_1]]
; GCN: buffer_store_dword v[[R_V2_F16]]
; GCN: s_endpgm

@ -13,7 +13,7 @@ define amdgpu_kernel void @v_fadd_f64(double addrspace(1)* %out, double addrspac
}
; CHECK-LABEL: {{^}}s_fadd_f64:
; CHECK: v_add_f64 {{v\[[0-9]+:[0-9]+\]}}, {{v\[[0-9]+:[0-9]+\]}}, {{s\[[0-9]+:[0-9]+\]}}
; CHECK: v_add_f64 {{v\[[0-9]+:[0-9]+\]}}, {{s\[[0-9]+:[0-9]+\]}}, {{v\[[0-9]+:[0-9]+\]}}
define amdgpu_kernel void @s_fadd_f64(double addrspace(1)* %out, double %r0, double %r1) {
%r2 = fadd double %r0, %r1
store double %r2, double addrspace(1)* %out

@ -205,9 +205,9 @@ define amdgpu_kernel void @test_fold_canonicalize_snan3_value_f16(half addrspace
}
; GCN-LABEL: {{^}}v_test_canonicalize_var_v2f16:
; VI: v_mul_f16_e32 [[REG0:v[0-9]+]], 1.0, {{v[0-9]+}}
; VI: v_mov_b32_e32 v[[CONST1:[0-9]+]], 0x3c00
; VI-DAG: v_mul_f16_sdwa [[REG0:v[0-9]+]], v[[CONST1]], {{v[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-DAG: v_mul_f16_e32 [[REG1:v[0-9]+]], 1.0, {{v[0-9]+}}
; VI-DAG: v_lshlrev_b32_e32 v{{[0-9]+}}, 16,
; VI-NOT: v_and_b32
; GFX9: v_pk_mul_f16 [[REG:v[0-9]+]], 1.0, {{v[0-9]+$}}
@ -223,7 +223,8 @@ define amdgpu_kernel void @v_test_canonicalize_var_v2f16(<2 x half> addrspace(1)
; GCN-LABEL: {{^}}v_test_canonicalize_fabs_var_v2f16:
; VI-DAG: v_bfe_u32
; VI-DAG: v_and_b32_e32 v{{[0-9]+}}, 0x7fff7fff, v{{[0-9]+}}
; VI: v_mul_f16_e32 [[REG0:v[0-9]+]], 1.0, v{{[0-9]+}}
; VI-DAG: v_mov_b32_e32 v[[CONST1:[0-9]+]], 0x3c00
; VI: v_mul_f16_sdwa [[REG0:v[0-9]+]], v[[CONST1]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
; VI: v_mul_f16_e32 [[REG1:v[0-9]+]], 1.0, v{{[0-9]+}}
; VI-NOT: 0xffff
; VI: v_or_b32
@ -240,9 +241,10 @@ define amdgpu_kernel void @v_test_canonicalize_fabs_var_v2f16(<2 x half> addrspa
}
; GCN-LABEL: {{^}}v_test_canonicalize_fneg_fabs_var_v2f16:
; VI: v_or_b32_e32 v{{[0-9]+}}, 0x80008000, v{{[0-9]+}}
; VI: v_mul_f16_e32 [[REG0:v[0-9]+]], 1.0, v{{[0-9]+}}
; VI: v_mul_f16_e32 [[REG1:v[0-9]+]], 1.0, v{{[0-9]+}}
; VI-DAG: v_mov_b32_e32 v[[CONST1:[0-9]+]], 0x3c00
; VI-DAG: v_or_b32_e32 v{{[0-9]+}}, 0x80008000, v{{[0-9]+}}
; VI-DAG: v_mul_f16_sdwa [[REG0:v[0-9]+]], v[[CONST1]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-DAG: v_mul_f16_e32 [[REG1:v[0-9]+]], 1.0, v{{[0-9]+}}
; VI: v_or_b32
; GFX9: v_and_b32_e32 [[ABS:v[0-9]+]], 0x7fff7fff, v{{[0-9]+}}
@ -259,11 +261,10 @@ define amdgpu_kernel void @v_test_canonicalize_fneg_fabs_var_v2f16(<2 x half> ad
; FIXME: Fold modifier
; GCN-LABEL: {{^}}v_test_canonicalize_fneg_var_v2f16:
; VI: v_xor_b32_e32 [[FNEG:v[0-9]+]], 0x80008000, v{{[0-9]+}}
; VI-DAG: v_lshrrev_b32_e32 [[FNEG_HI:v[0-9]+]], 16, [[FNEG]]
; VI-DAG: v_mul_f16_e32 [[REG1:v[0-9]+]], 1.0, [[FNEG_HI]]
; VI-DAG: v_mov_b32_e32 v[[CONST1:[0-9]+]], 0x3c00
; VI-DAG: v_xor_b32_e32 [[FNEG:v[0-9]+]], 0x80008000, v{{[0-9]+}}
; VI-DAG: v_mul_f16_sdwa [[REG1:v[0-9]+]], v[[CONST1]], [[FNEG]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-DAG: v_mul_f16_e32 [[REG0:v[0-9]+]], 1.0, [[FNEG]]
; VI-DAG: v_lshlrev_b32_e32 v{{[0-9]+}}, 16,
; VI-NOT: 0xffff
; GFX9: v_pk_mul_f16 [[REG:v[0-9]+]], 1.0, {{v[0-9]+}} neg_lo:[0,1] neg_hi:[0,1]{{$}}

@ -96,17 +96,18 @@ entry:
}
; GCN-LABEL: {{^}}fmul_v2f16_imm_a:
; GCN: buffer_load_dword v[[B_V2_F16:[0-9]+]]
; GCN-DAG: buffer_load_dword v[[B_V2_F16:[0-9]+]]
; SI: v_cvt_f32_f16_e32 v[[B_F32_0:[0-9]+]], v[[B_V2_F16]]
; GCN: v_lshrrev_b32_e32 v[[B_F16_1:[0-9]+]], 16, v[[B_V2_F16]]
; SI: v_lshrrev_b32_e32 v[[B_F16_1:[0-9]+]], 16, v[[B_V2_F16]]
; SI: v_cvt_f32_f16_e32 v[[B_F32_1:[0-9]+]], v[[B_F16_1]]
; SI: v_mul_f32_e32 v[[R_F32_0:[0-9]+]], 0x40400000, v[[B_F32_0]]
; SI: v_cvt_f16_f32_e32 v[[R_F16_0:[0-9]+]], v[[R_F32_0]]
; SI: v_mul_f32_e32 v[[R_F32_1:[0-9]+]], 4.0, v[[B_F32_1]]
; SI: v_cvt_f16_f32_e32 v[[R_F16_1:[0-9]+]], v[[R_F32_1]]
; VI-DAG: v_mul_f16_e32 v[[R_F16_1:[0-9]+]], 4.0, v[[B_F16_1]]
; VI-DAG: v_mov_b32_e32 v[[CONST4:[0-9]+]], 0x4400
; VI-DAG: v_mul_f16_sdwa v[[R_F16_HI:[0-9]+]], v[[CONST4]], v[[B_V2_F16]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-DAG: v_mul_f16_e32 v[[R_F16_0:[0-9]+]], 0x4200, v[[B_V2_F16]]
; GCN-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_1]]
; SI-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_1]]
; GCN: v_or_b32_e32 v[[R_V2_F16:[0-9]+]], v[[R_F16_HI]], v[[R_F16_0]]
; GCN: buffer_store_dword v[[R_V2_F16]]
; GCN: s_endpgm
@ -121,17 +122,18 @@ entry:
}
; GCN-LABEL: {{^}}fmul_v2f16_imm_b:
; GCN: buffer_load_dword v[[A_V2_F16:[0-9]+]]
; GCN-DAG: buffer_load_dword v[[A_V2_F16:[0-9]+]]
; SI: v_cvt_f32_f16_e32 v[[A_F32_0:[0-9]+]], v[[A_V2_F16]]
; GCN: v_lshrrev_b32_e32 v[[A_F16_1:[0-9]+]], 16, v[[A_V2_F16]]
; SI: v_lshrrev_b32_e32 v[[A_F16_1:[0-9]+]], 16, v[[A_V2_F16]]
; SI: v_cvt_f32_f16_e32 v[[A_F32_1:[0-9]+]], v[[A_F16_1]]
; SI: v_mul_f32_e32 v[[R_F32_0:[0-9]+]], 4.0, v[[A_F32_0]]
; SI: v_cvt_f16_f32_e32 v[[R_F16_0:[0-9]+]], v[[R_F32_0]]
; SI: v_mul_f32_e32 v[[R_F32_1:[0-9]+]], 0x40400000, v[[A_F32_1]]
; SI: v_cvt_f16_f32_e32 v[[R_F16_1:[0-9]+]], v[[R_F32_1]]
; VI-DAG: v_mul_f16_e32 v[[R_F16_1:[0-9]+]], 0x4200, v[[A_F16_1]]
; VI-DAG: v_mov_b32_e32 v[[CONST3:[0-9]+]], 0x4200
; VI-DAG: v_mul_f16_sdwa v[[R_F16_HI:[0-9]+]], v[[CONST3]], v[[A_V2_F16]] dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
; VI-DAG: v_mul_f16_e32 v[[R_F16_0:[0-9]+]], 4.0, v[[A_V2_F16]]
; GCN-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_1]]
; SI-DAG: v_lshlrev_b32_e32 v[[R_F16_HI:[0-9]+]], 16, v[[R_F16_1]]
; GCN: v_or_b32_e32 v[[R_V2_F16:[0-9]+]], v[[R_F16_HI]], v[[R_F16_0]]
; GCN: buffer_store_dword v[[R_V2_F16]]
; GCN: s_endpgm

@ -71,7 +71,9 @@ define amdgpu_kernel void @v_fneg_fabs_f16(half addrspace(1)* %out, half addrspa
; FIXME: single bit op
; GCN-LABEL: {{^}}s_fneg_fabs_v2f16:
; CIVI: s_mov_b32 [[MASK:s[0-9]+]], 0x8000{{$}}
; CIVI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; VI: v_mov_b32_e32 [[VMASK:v[0-9]+]], [[MASK]]
; CI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; VI: v_or_b32_sdwa v{{[0-9]+}}, [[VMASK]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
; CIVI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CIVI: flat_store_dword
@ -85,10 +87,15 @@ define amdgpu_kernel void @s_fneg_fabs_v2f16(<2 x half> addrspace(1)* %out, <2 x
; GCN-LABEL: {{^}}fneg_fabs_v4f16:
; CIVI: s_mov_b32 [[MASK:s[0-9]+]], 0x8000{{$}}
; CIVI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CIVI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CIVI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CIVI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; CI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; VI: v_mov_b32_e32 [[VMASK:v[0-9]+]], [[MASK]]
; VI: v_or_b32_sdwa v{{[0-9]+}}, [[VMASK]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
; VI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; VI: v_or_b32_sdwa v{{[0-9]+}}, [[VMASK]], v{{[0-9]+}} dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:DWORD
; VI: v_or_b32_e32 v{{[0-9]+}}, [[MASK]],
; GFX9: s_mov_b32 [[MASK:s[0-9]+]], 0x80008000
; GFX9: s_or_b32 s{{[0-9]+}}, [[MASK]], s{{[0-9]+}}

@ -5,7 +5,7 @@
; into 2 modifiers, although theoretically that should work.
; GCN-LABEL: {{^}}fneg_fabs_fadd_f64:
; GCN: v_add_f64 {{v\[[0-9]+:[0-9]+\]}}, -|v{{\[[0-9]+:[0-9]+\]}}|, {{s\[[0-9]+:[0-9]+\]}}
; GCN: v_add_f64 {{v\[[0-9]+:[0-9]+\]}}, {{s\[[0-9]+:[0-9]+\]}}, -|v{{\[[0-9]+:[0-9]+\]}}|
define amdgpu_kernel void @fneg_fabs_fadd_f64(double addrspace(1)* %out, double %x, double %y) {
%fabs = call double @llvm.fabs.f64(double %x)
%fsub = fsub double -0.000000e+00, %fabs
@ -25,7 +25,7 @@ define amdgpu_kernel void @v_fneg_fabs_fadd_f64(double addrspace(1)* %out, doubl
}
; GCN-LABEL: {{^}}fneg_fabs_fmul_f64:
; GCN: v_mul_f64 {{v\[[0-9]+:[0-9]+\]}}, -|{{v\[[0-9]+:[0-9]+\]}}|, {{s\[[0-9]+:[0-9]+\]}}
; GCN: v_mul_f64 {{v\[[0-9]+:[0-9]+\]}}, {{s\[[0-9]+:[0-9]+\]}}, -|v{{\[[0-9]+:[0-9]+\]}}|
define amdgpu_kernel void @fneg_fabs_fmul_f64(double addrspace(1)* %out, double %x, double %y) {
%fabs = call double @llvm.fabs.f64(double %x)
%fsub = fsub double -0.000000e+00, %fabs

@ -4,7 +4,7 @@
; FUNC-LABEL: {{^}}fneg_fabs_fadd_f32:
; SI-NOT: and
; SI: v_subrev_f32_e64 {{v[0-9]+}}, |{{v[0-9]+}}|, {{s[0-9]+}}
; SI: v_sub_f32_e64 {{v[0-9]+}}, {{s[0-9]+}}, |{{v[0-9]+}}|
define amdgpu_kernel void @fneg_fabs_fadd_f32(float addrspace(1)* %out, float %x, float %y) {
%fabs = call float @llvm.fabs.f32(float %x)
%fsub = fsub float -0.000000e+00, %fabs
@ -15,7 +15,7 @@ define amdgpu_kernel void @fneg_fabs_fadd_f32(float addrspace(1)* %out, float %x
; FUNC-LABEL: {{^}}fneg_fabs_fmul_f32:
; SI-NOT: and
; SI: v_mul_f32_e64 {{v[0-9]+}}, -|{{v[0-9]+}}|, {{s[0-9]+}}
; SI: v_mul_f32_e64 {{v[0-9]+}}, {{s[0-9]+}}, -|{{v[0-9]+}}|
; SI-NOT: and
define amdgpu_kernel void @fneg_fabs_fmul_f32(float addrspace(1)* %out, float %x, float %y) {
%fabs = call float @llvm.fabs.f32(float %x)

@ -130,13 +130,15 @@ define amdgpu_kernel void @v_fneg_fold_v2f16(<2 x half> addrspace(1)* %out, <2 x
}
; GCN-LABEL: {{^}}v_extract_fneg_fold_v2f16:
; GCN: flat_load_dword [[VAL:v[0-9]+]]
; GCN-DAG: flat_load_dword [[VAL:v[0-9]+]]
; CI-DAG: v_mul_f32_e32 v{{[0-9]+}}, -4.0, v{{[0-9]+}}
; CI-DAG: v_sub_f32_e32 v{{[0-9]+}}, 2.0, v{{[0-9]+}}
; GFX89: v_lshrrev_b32_e32 [[ELT1:v[0-9]+]], 16, [[VAL]]
; GFX9: v_lshrrev_b32_e32 [[ELT1:v[0-9]+]], 16, [[VAL]]
; GFX89-DAG: v_mul_f16_e32 v{{[0-9]+}}, -4.0, [[VAL]]
; GFX89-DAG: v_sub_f16_e32 v{{[0-9]+}}, 2.0, [[ELT1]]
; GFX9-DAG: v_sub_f16_e32 v{{[0-9]+}}, 2.0, [[ELT1]]
; VI-DAG: v_mov_b32_e32 [[CONST2:v[0-9]+]], 0x4000
; VI-DAG: v_sub_f16_sdwa v{{[0-9]+}}, [[CONST2]], [[VAL]] dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:DWORD src1_sel:WORD_1
define amdgpu_kernel void @v_extract_fneg_fold_v2f16(<2 x half> addrspace(1)* %in) #0 {
%val = load <2 x half>, <2 x half> addrspace(1)* %in
%fneg = fsub <2 x half> <half -0.0, half -0.0>, %val

@ -12,7 +12,7 @@ declare double @llvm.floor.f64(double) #0
; SI-DAG: v_fract_f64_e32 [[FRC:v\[[0-9]+:[0-9]+\]]], v{{\[}}[[LO:[0-9]+]]:[[HI:[0-9]+]]]
; SI-DAG: v_mov_b32_e32 v[[UPLO:[0-9]+]], -1
; SI-DAG: v_mov_b32_e32 v[[UPHI:[0-9]+]], 0x3fefffff
; SI-DAG: v_min_f64 v{{\[}}[[MINLO:[0-9]+]]:[[MINHI:[0-9]+]]], v{{\[}}[[UPLO]]:[[UPHI]]], [[FRC]]
; SI-DAG: v_min_f64 v{{\[}}[[MINLO:[0-9]+]]:[[MINHI:[0-9]+]]], [[FRC]], v{{\[}}[[UPLO]]:[[UPHI]]]
; SI-DAG: v_cmp_class_f64_e64 vcc, v{{\[}}[[LO]]:[[HI]]], 3
; SI: v_cndmask_b32_e32 v[[RESLO:[0-9]+]], v[[MINLO]], v[[LO]], vcc
; SI: v_cndmask_b32_e32 v[[RESHI:[0-9]+]], v[[MINHI]], v[[HI]], vcc
@ -39,7 +39,7 @@ define amdgpu_kernel void @fract_f64(double addrspace(1)* %out, double addrspace
; SI-DAG: v_fract_f64_e64 [[FRC:v\[[0-9]+:[0-9]+\]]], -v{{\[}}[[LO:[0-9]+]]:[[HI:[0-9]+]]]
; SI-DAG: v_mov_b32_e32 v[[UPLO:[0-9]+]], -1
; SI-DAG: v_mov_b32_e32 v[[UPHI:[0-9]+]], 0x3fefffff
; SI-DAG: v_min_f64 v{{\[}}[[MINLO:[0-9]+]]:[[MINHI:[0-9]+]]], v{{\[}}[[UPLO]]:[[UPHI]]], [[FRC]]
; SI-DAG: v_min_f64 v{{\[}}[[MINLO:[0-9]+]]:[[MINHI:[0-9]+]]], [[FRC]], v{{\[}}[[UPLO]]:[[UPHI]]]
; SI-DAG: v_cmp_class_f64_e64 vcc, v{{\[}}[[LO]]:[[HI]]], 3
; SI: v_cndmask_b32_e32 v[[RESLO:[0-9]+]], v[[MINLO]], v[[LO]], vcc
; SI: v_cndmask_b32_e32 v[[RESHI:[0-9]+]], v[[MINHI]], v[[HI]], vcc
@ -67,7 +67,7 @@ define amdgpu_kernel void @fract_f64_neg(double addrspace(1)* %out, double addrs
; SI-DAG: v_fract_f64_e64 [[FRC:v\[[0-9]+:[0-9]+\]]], -|v{{\[}}[[LO:[0-9]+]]:[[HI:[0-9]+]]]|
; SI-DAG: v_mov_b32_e32 v[[UPLO:[0-9]+]], -1
; SI-DAG: v_mov_b32_e32 v[[UPHI:[0-9]+]], 0x3fefffff
; SI-DAG: v_min_f64 v{{\[}}[[MINLO:[0-9]+]]:[[MINHI:[0-9]+]]], v{{\[}}[[UPLO]]:[[UPHI]]], [[FRC]]
; SI-DAG: v_min_f64 v{{\[}}[[MINLO:[0-9]+]]:[[MINHI:[0-9]+]]], [[FRC]], v{{\[}}[[UPLO]]:[[UPHI]]]
; SI-DAG: v_cmp_class_f64_e64 vcc, v{{\[}}[[LO]]:[[HI]]], 3
; SI: v_cndmask_b32_e32 v[[RESLO:[0-9]+]], v[[MINLO]], v[[LO]], vcc
; SI: v_cndmask_b32_e32 v[[RESHI:[0-9]+]], v[[MINHI]], v[[HI]], vcc
