Vendor import of llvm tags/RELEASE_32/final r170710 (effectively, 3.2 release):
http://llvm.org/svn/llvm-project/llvm/tags/RELEASE_32/final@170710
Dimitry Andric 2012-12-22 14:58:30 +00:00
parent 522600a229
commit 482e7bddf6
24 changed files with 563 additions and 353 deletions


@ -29,12 +29,6 @@
<p>Written by the <a href="http://llvm.org/">LLVM Team</a></p>
</div>
<h1 style="color:red">These are in-progress notes for the upcoming LLVM 3.2
release.<br>
You may prefer the
<a href="http://llvm.org/releases/3.1/docs/ReleaseNotes.html">LLVM 3.1
Release Notes</a>.</h1>
<!-- *********************************************************************** -->
<h2>
<a name="intro">Introduction</a>
@ -46,7 +40,7 @@ Release Notes</a>.</h1>
<p>This document contains the release notes for the LLVM Compiler
Infrastructure, release 3.2. Here we describe the status of LLVM, including
major improvements from the previous release, improvements in various
subprojects of LLVM, and some of the current users of the code. All LLVM
sub-projects of LLVM, and some of the current users of the code. All LLVM
releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM
releases web site</a>.</p>
@ -72,11 +66,12 @@ Release Notes</a>.</h1>
<div>
<p>The LLVM 3.2 distribution currently consists of code from the core LLVM
repository, which roughly includes the LLVM optimizers, code generators and
supporting tools, and the Clang repository. In addition to this code, the
LLVM Project includes other sub-projects that are in development. Here we
include updates on these subprojects.</p>
<p>The LLVM 3.2 distribution currently consists of production-quality code
from the core LLVM repository, which roughly includes the LLVM optimizers,
code generators and supporting tools, as well as Clang, DragonEgg and
compiler-rt sub-project repositories. In addition to this code, the LLVM
Project includes other sub-projects that are in development. Here we
include updates on these sub-projects.</p>
<!--=========================================================================-->
<h3>
@ -90,18 +85,18 @@ Release Notes</a>.</h1>
experience through expressive diagnostics, a high level of conformance to
language standards, fast compilation, and low memory use. Like LLVM, Clang
provides a modular, library-based architecture that makes it suitable for
creating or integrating with other development tools. Clang is considered a
production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86
(32- and 64-bit), and for Darwin/ARM targets.</p>
creating or integrating with other development tools.</p>
<p>In the LLVM 3.2 time-frame, the Clang team has made many improvements.
Highlights include:</p>
<ul>
<li>...</li>
<li>Improvements to Clang's diagnostics</li>
<li>Support for tls_model attribute</li>
<li>Type safety attributes</li>
</ul>
<p>For more details about the changes to Clang since the 3.1 release, see the
<a href="http://clang.llvm.org/docs/ReleaseNotes.html">Clang release
<a href="http://llvm.org/releases/3.2/tools/clang/docs/ReleaseNotes.html">Clang 3.2 release
notes.</a></p>
<p>If Clang rejects your code but another compiler accepts it, please take a
@ -129,7 +124,10 @@ Release Notes</a>.</h1>
<p>The 3.2 release has the following notable changes:</p>
<ul>
<li>...</li>
<li>Able to load LLVM plugins such as Polly.</li>
<li>Supports thread-local storage models.</li>
<li>Passes knowledge of variable lifetimes to the LLVM optimizers.</li>
<li>No longer requires GCC to be built with LTO support.</li>
</ul>
</div>
@ -141,7 +139,8 @@ Release Notes</a>.</h1>
<div>
<p>The new LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>
<p>The LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>
is a simple library that provides an implementation of the low-level
target-specific hooks required by code generation and other runtime
components. For example, when compiling for a 32-bit target, converting a
@ -153,7 +152,11 @@ Release Notes</a>.</h1>
<p>The 3.2 release has the following notable changes:</p>
<ul>
<li>...</li>
<li><a href="http://llvm.org/releases/3.2/tools/clang/docs/ThreadSanitizer.html">ThreadSanitizer (TSan)</a> - data race detector run-time library for C/C++ has been added.</li>
<li>Improvements to <a href="http://llvm.org/releases/3.2/tools/clang/docs/AddressSanitizer.html">AddressSanitizer</a> including: better portability
(OSX, Android NDK), support for cmake based builds, enhanced error reporting and lots of bug fixes.</li>
<li>Added support for A6 'Swift' CPU.</li>
<li><code>divsi3</code> function has been enhanced to take advantage of a hardware unsigned divide when it is available.</li>
</ul>
</div>
@ -174,7 +177,9 @@ Release Notes</a>.</h1>
<p>The 3.2 release has the following notable changes:</p>
<ul>
<li>...</li>
<li>Linux build fixes for clang (see <a href="http://lldb.llvm.org/build.html">Building LLDB</a>)</li>
<li>Some Linux stability and usability improvements</li>
<li>Switch expression evaluation to use MCJIT (from legacy JIT) on Linux</li>
</ul>
</div>
@ -193,7 +198,15 @@ Release Notes</a>.</h1>
<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>
<ul>
<li>...</li>
<li> C++11 shared_ptr atomic access API (20.7.2.5) has been implemented.</li>
<li>Applied noexcept and constexpr throughout library.</li>
<li>Improved C++11 conformance in associative container emplace.</li>
<li>Performance improvements in: std::rotate algorithm and I/O.</li>
<li>Operator new/delete and type_infos for exception types moved from libc++ to libc++abi.</li>
<li>Bug fixes in: <code>&lt;atomic&gt;</code>; vector<code>&lt;bool&gt;</code> algorithms,
<code>&lt;future&gt;</code>,<code>&lt;tuple&gt;</code>,
<code>&lt;type_traits&gt;</code>,<code>&lt;fstream&gt;</code>,<code>&lt;istream&gt;</code>,
<code>&lt;iterator&gt;</code>, <code>&lt;condition_variable&gt;</code>,<code>&lt;complex&gt;</code> as well as visibility fixes.</li>
</ul>
</div>
@ -212,7 +225,7 @@ Release Notes</a>.</h1>
<p>The 3.2 release has the following notable changes:</p>
<ul>
<li>...</li>
<li>Bug fixes only, no functional changes.</li>
</ul>
</div>
@ -227,18 +240,63 @@ Release Notes</a>.</h1>
<p><a href="http://polly.llvm.org/">Polly</a> is an <em>experimental</em>
optimizer for data locality and parallelism. It currently provides high-level
loop optimizations and automatic parallelisation (using the OpenMP run time).
loop optimizations and automatic parallelization (using the OpenMP run time).
Work in the area of automatic SIMD and accelerator code generation was
started.</p>
<p>Within the LLVM 3.2 time-frame there were the following highlights:</p>
<ul>
<li>...</li>
<li>isl, the integer set library used by Polly, was relicensed under the MIT license.</li>
<li>isl based code generation.</li>
<li>MIT licensed replacement for CLooG (LGPLv2).</li>
<li>Fine grained option handling (separation of core and border computations, control overhead vs. code size).</li>
<li>Support for FORTRAN and Dragonegg.</li>
<li>OpenMP code generation fixes.</li>
</ul>
</div>
<!--=========================================================================-->
<h3>
<a name="StaticAnalyzer">Clang Static Analyzer</a>
</h3>
<div>
<p>The <a href="http://clang-analyzer.llvm.org/">Clang Static Analyzer</a>
is an advanced source code analysis tool integrated into Clang that performs
a deep analysis of code to find potential bugs.</p>
<p>In the LLVM 3.2 release, the static analyzer has made significant improvements
in many areas, with notable highlights such as:</p>
<ul>
<li>Improved interprocedural analysis within a translation unit (see details below), which greatly amplified the analyzer's ability to find bugs.</li>
<li>New infrastructure to model &quot;well-known&quot; APIs, allowing the analyzer to do a much better job when modeling calls to such functions.</li>
<li>Significant improvements to the APIs to write static analyzer checkers, with a more unified way of representing function/method calls in the checker API. Details can be found in the <a href="http://llvm.org/devmtg/2012-11#talk13">Building a Checker in 24 hours</a> talk.</li>
</ul>
<p>The release specifically includes notable improvements for Objective-C analysis, including:</p>
<ul>
<li>Interprocedural analysis for Objective-C methods.</li>
<li>Interprocedural analysis of calls to &quot;blocks&quot;.</li>
<li>Precise modeling of GCD APIs such as <tt>dispatch_once</tt> and friends.</li>
<li>Improved support for recently added Objective-C constructs such as array and dictionary literals.</li>
</ul>
<p>The release specifically includes notable improvements for C++ analysis, including:</p>
<ul>
<li>Interprocedural analysis for C++ methods (within a translation unit).</li>
<li>More precise modeling of C++ initializers and destructors.</li>
</ul>
<p>Finally, this release includes many small improvements to <tt>scan-build</tt>, which can be used to drive the analyzer from the command line or a continuous integration system. This includes a fix for a directory-traversal issue, which could cause potential security problems in some cases. We would like to acknowledge Tim Brown of Portcullis Computer Security Ltd for reporting this issue.</p>
</div>
</div>
<!-- *********************************************************************** -->
@ -265,6 +323,19 @@ Release Notes</a>.</h1>
</div>
<h3>EmbToolkit</h3>
<div>
<p><a href="http://www.embtoolkit.org/">EmbToolkit</a> provides Linux cross-compiler
toolchain/SDK (GCC/binutils/C library (uclibc,eglibc,musl)), a build system for
package cross-compilation and optionally various root file systems.
It supports ARM and MIPS. There is an ongoing effort to provide a clang+llvm
environment for the 3.2 releases.
</p>
</div>
<h3>FAUST</h3>
<div>
@ -274,7 +345,7 @@ Release Notes</a>.</h1>
AUdio STream. Its programming model combines two approaches: functional
programming and block diagram composition. In addition with the C, C++, Java,
JavaScript output formats, the Faust compiler can generate LLVM bitcode, and
works with LLVM 2.7-3.1.</p>
works with LLVM 2.7-3.2.</p>
</div>
@ -331,7 +402,11 @@ Release Notes</a>.</h1>
<p>OSL was developed by Sony Pictures Imageworks for use in its in-house
renderer used for feature film animation and visual effects, and is
distributed as open source software with the "New BSD" license.</p>
distributed as open source software with the "New BSD" license.
It has been used for all the shading on such films as The Amazing Spider-Man,
Men in Black III, Hotel Transylvania, and many other films in progress,
and also has been incorporated into several commercial and open source
rendering products such as Blender, VRay, and Autodesk Beast.</p>
</div>
@ -367,7 +442,7 @@ Release Notes</a>.</h1>
C++, Fortran and Faust code in Pure programs if the corresponding
LLVM-enabled compilers are installed).</p>
<p>Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and
<p>Pure version 0.56 has been tested and is known to work with LLVM 3.2 (and
continues to work with older LLVM releases >= 2.5).</p>
</div>
@ -432,7 +507,9 @@ Release Notes</a>.</h1>
<p>LLVM 3.2 includes several major changes and big features:</p>
<ul>
<li>...</li>
<li>Loop Vectorizer.</li>
<li>New implementation of SROA.</li>
<li>New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources.</li>
</ul>
</div>
@ -451,7 +528,10 @@ Release Notes</a>.</h1>
<ul>
<li>Thread local variables may have a specified TLS model. See the
<a href="LangRef.html#globalvars">Language Reference Manual</a>.</li>
<li>...</li>
<li>'TYPE_CODE_FUNCTION_OLD' type code and autoupgrade code for old function attributes format has been removed.</li>
<li>Internal representation of the Attributes class has been converted into a pointer to an
opaque object that's uniqued by and stored in the LLVMContext object.
The Attributes class then becomes a thin wrapper around this opaque object.</li>
</ul>
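The TLS-model item above can be illustrated from the source-language side. The following is a minimal sketch, assuming an ELF target and the GNU-style `tls_model` attribute that the Clang highlights mention; the variable and function names are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Hypothetical per-thread counter. The attribute asks the compiler to use the
// "initial-exec" TLS model for this variable instead of the default choice.
// Assumption: an ELF platform where this attribute is accepted.
static __thread int perThreadCounter __attribute__((tls_model("initial-exec")));

// Increments the calling thread's copy of the counter and returns the new value.
int bumpCounter() {
  return ++perThreadCounter;
}
```

Which model is appropriate depends on how the object is linked and loaded; the sketch only shows where the model is specified, not which one to prefer.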
</div>
@ -489,23 +569,33 @@ Release Notes</a>.</h1>
<ul>
<li>The inner most loops must have a single basic block.</li>
<li>The number of iterations are known before the loop starts to execute.</li>
<li>The loop counter needs to be incrimented by one.</li>
<li>The loop counter needs to be incremented by one.</li>
<li>The loop trip count <b>can</b> be a variable.</li>
<li>Loops do <b>not</b> need to start at zero.</li>
<li>The induction variable can be used inside the loop.</li>
<li>Loop reductions are supported.</li>
<li>Arrays with affine access pattern do <b>not</b> need to be marked as 'noalias' and are checked at runtime.</li>
<li>...</li>
</ul>
</p>
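The conditions listed above can be read against a concrete loop. The following is a minimal sketch of a loop shaped to meet them; the function name and parameters are hypothetical, and whether a given compiler build actually vectorizes it has not been verified here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: a single-block innermost loop whose trip count
// (end - begin) is a variable that is fixed before the loop starts.
int sumScaled(const int *in, int *out, std::size_t begin, std::size_t end) {
  int sum = 0;                                  // reduction variable
  for (std::size_t i = begin; i < end; ++i) {   // need not start at zero, step is one
    out[i] = in[i] * 2 + static_cast<int>(i);   // induction variable used in the body
    sum += out[i];                              // loop reduction
  }
  return sum;
}
```

The `in` and `out` pointers are not marked as non-aliasing, which corresponds to the item about affine accesses being checked at runtime.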
<p>SROA - We've re-written SROA to be significantly more powerful.
<!-- FIXME: Add more text here... --></p>
<p>SROA - We&#8217;ve re-written SROA to be significantly more powerful and generate
code which is much more friendly to the rest of the optimization pipeline.
Previously this pass had scaling problems that required it to only operate on
relatively small aggregates, and at times it would mistakenly replace a large
aggregate with a single very large integer in order to make it a scalar SSA
value. The result was a large number of i1024 and i2048 values representing any
small stack buffer. These in turn slowed down many subsequent optimization
paths.</p>
<p>The new SROA pass uses a different algorithm that allows it to only promote to
scalars the pieces of the aggregate actively in use. Because of this it doesn&#8217;t
require any thresholds. It also always deduces the scalar values from the uses
of the aggregate rather than the specific LLVM type of the aggregate. These
features combine both to optimize more code with the pass and to improve the
compile time of many functions dramatically.</p>
<ul>
<li>Branch weight metadata is preseved through more of the optimizer.</li>
<li>...</li>
<li>Branch weight metadata is preserved through more of the optimizer.</li>
</ul>
</div>
@ -524,8 +614,19 @@ Release Notes</a>.</h1>
<a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro
to the LLVM MC Project Blog Post</a>.</p>
<ul>
<li>...</li>
<ul>
<li> Added support for following assembler directives: <code>.ifb</code>, <code>.ifnb</code>, <code>.ifc</code>,
<code>.ifnc</code>, <code>.purgem</code>, <code>.rept</code> and <code>.version</code> (ELF) as well as Darwin specific
<code>.pushsection</code>, <code>.popsection</code> and <code>.previous</code> .</li>
<li>Enhanced handling of <code>.lcomm</code> directive.</li>
<li>MS style inline assembler: added implementation of the offset and TYPE operators.</li>
<li>Targets can specify minimum supported NOP size for NOP padding.</li>
<li>ELF improvements: added support for generating ELF objects on Windows.</li>
<li>MachO improvements: symbol-difference variables are marked as N_ABS, added direct-to-object attribute for data-in-code markers.</li>
<li>Added support for annotated disassembly output for x86 and arm targets.</li>
<li>Arm support has been improved by adding support for ARM TARGET2 relocation
and fixing handling of ARM-style "$d.*" labels.</li>
<li>Implemented local-exec TLS on PowerPC.</li>
</ul>
</div>
@ -550,10 +651,6 @@ Release Notes</a>.</h1>
infrastructure, which allows us to implement more aggressive algorithms and
make it run faster:</p>
<ul>
<li>...</li>
</ul>
<p> We added new TableGen infrastructure to support bundling for
Very Long Instruction Word (VLIW) architectures. TableGen can now
automatically generate a deterministic finite automaton from a VLIW
@ -563,6 +660,13 @@ Release Notes</a>.</h1>
<p> We have added a new target independent VLIW packetizer based on the
DFA infrastructure to group machine instructions into bundles.</p>
<p> We have added new TableGen infrastructure to support relationship maps
between instructions. This feature enables TableGen to automatically
construct a set of relation tables and query functions that can be used
to switch between various forms of instructions. For more information,
please refer to <a href="http://llvm.org/docs/HowToUseInstrMappings.html">
How To Use Instruction Mappings</a>.</p>
</div>
<h4>
@ -588,7 +692,7 @@ Release Notes</a>.</h1>
<p>New features and major changes in the X86 target include:</p>
<ul>
<li>...</li>
<li>Small codegen optimizations, especially for AVX2.</li>
</ul>
</div>
@ -603,7 +707,7 @@ Release Notes</a>.</h1>
<p>New features of the ARM target include:</p>
<ul>
<li>...</li>
<li>Support and performance tuning for the A6 'Swift' CPU.</li>
</ul>
<!--_________________________________________________________________________-->
@ -620,7 +724,7 @@ Release Notes</a>.</h1>
platform specific support for Linux.</p>
<p>Full support is included for Thumb1, Thumb2 and ARM modes, along with
subtarget and CPU specific extensions for VFP2, VFP3 and NEON.</p>
sub-target and CPU specific extensions for VFP2, VFP3 and NEON.</p>
<p>The assembler is Unified Syntax only (see ARM Architecural Reference Manual
for details). While there is some, and growing, support for pre-unfied
@ -640,7 +744,29 @@ Release Notes</a>.</h1>
<p>New features and major changes in the MIPS target include:</p>
<ul>
<li>...</li>
<li>Integrated assembler support:
MIPS32 works for both PIC and static, known limitation is the PR14456 where
R_MIPS_GPREL16 relocation is generated with the wrong addend.
MIPS64 support is incomplete, for example exception handling is not working.</li>
<li>Support for fast calling convention has been added.</li>
<li>Support for Android MIPS toolchain has been added to clang driver.</li>
<li>Added clang driver support for MIPS N32 ABI through "-mabi=n32" option.</li>
<li>MIPS32 and MIPS64 disassembler has been implemented.</li>
<li>Support for compiling programs with large GOTs (exceeding 64kB in size) has been added
through llc option "-mxgot".</li>
<li>Added experimental support for MIPS32 DSP intrinsics.</li>
<li>Experimental support for MIPS16 with following limitations: only soft float is supported,
C++ exceptions are not supported, large stack frames (> 32000 bytes) are not supported,
direct object code emission is not supported, only .s output.</li>
<li>Standalone assembler (llvm-mc): implementation is in progress and considered experimental.</li>
<li>All classic JIT and MCJIT tests pass on Little and Big Endian MIPS32 platforms.</li>
<li>Inline asm support: all common constraints and operand modifiers have been implemented.</li>
<li>Added tail call optimization support, use llc option "-enable-mips-tail-calls"
or clang options "-mllvm -enable-mips-tail-calls" to enable it.</li>
<li>Improved register allocation by removing registers $fp, $gp, $ra and $at from the list of reserved registers.</li>
<li>Long branch expansion pass has been implemented, which expands branch
instructions with offsets that do not fit in the 16-bit field.</li>
<li>Cavium Octeon II board is used for testing builds (llvm-mips-linux builder).</li>
</ul>
</div>
@ -652,7 +778,6 @@ Release Notes</a>.</h1>
<div>
<ul>
<p>Many fixes and changes across LLVM (and Clang) for better compliance with
the 64-bit PowerPC ELF Application Binary Interface, interoperability with
GCC, and overall 64-bit PowerPC support. Some highlights include:</p>
@ -681,8 +806,28 @@ Release Notes</a>.</h1>
<p>There have also been code generation improvements for both 32- and 64-bit
code. Instruction scheduling support for the Freescale e500mc and e5500
cores has been added.</p>
</div>
<!--=========================================================================-->
<h3>
<a name="NVPTX">PTX/NVPTX Target Improvements</a>
</h3>
<div>
<p>The PTX back-end has been replaced by the NVPTX back-end, which is based on
the LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler.
Some highlights include:</p>
<ul>
<li>Compatibility with PTX 3.1 and SM 3.5</li>
<li>Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK</li>
<li>Full compatibility with old PTX back-end, with much greater coverage of
LLVM IR</li>
</ul>
<p>Please submit any back-end bugs to the LLVM Bugzilla site.</p>
</div>
<!--=========================================================================-->
@ -693,7 +838,7 @@ Release Notes</a>.</h1>
<div>
<ul>
<li>...</li>
<li>Added support for custom names for library functions in TargetLibraryInfo.</li>
</ul>
</div>
@ -710,9 +855,11 @@ Release Notes</a>.</h1>
from the previous release.</p>
<ul>
<li>...</li>
</ul>
<li>llvm-ld and llvm-stub have been removed, llvm-ld functionality can be partially replaced by
llvm-link | opt | {llc | as, llc -filetype=obj} | ld, or fully replaced by Clang. </li>
<li>MCJIT: added support for inline assembly (requires asm parser), added faux remote target execution to lli option '-remote-mcjit'.</li>
</ul>
</div>
<!--=========================================================================-->
@ -733,10 +880,6 @@ Release Notes</a>.</h1>
<p> The TargetData structure has been renamed to DataLayout and moved to VMCore
to remove a dependency on Target. </p>
<ul>
<li>...</li>
</ul>
</div>
<!--=========================================================================-->
@ -746,34 +889,23 @@ to remove a dependency on Target. </p>
<div>
<p>In addition, some tools have changed in this release. Some of the changes
are:</p>
<p>In addition, some tools have changed in this release. Some of the changes are:</p>
<ul>
<li>...</li>
<li>opt: added support for '-mtriple' option.</li>
<li>llvm-mc: added '-disassemble' support for '-show-inst' and '-show-encoding' options, added '-edis' option to produce annotated
disassembly output for X86 and ARM targets.</li>
<li>libprofile: allows the profile data file name to be specified by the LLVMPROF_OUTPUT environment variable.</li>
<li>llvm-objdump: has been changed to display available targets, '-arch' option accepts x86 and x86-64 as valid arch names.</li>
<li>llc and opt: added FMA formation from pairs of FADD + FMUL or FSUB + FMUL enabled by option '-enable-excess-fp-precision' or option '-enable-unsafe-fp-math',
option '-fp-contract' controls the creation by optimizations of fused FP by selecting Fast, Standard, or Strict mode.</li>
<li>llc: object file output from llc is no longer considered experimental.</li>
<li>gold plugin: handles Position Independent Executables.</li>
</ul>
</div>
<!--=========================================================================-->
<h3>
<a name="python">Python Bindings</a>
</h3>
<div>
<p>Officially supported Python bindings have been added! Feature support is far
from complete. The current bindings support interfaces to:</p>
<ul>
<li>...</li>
</ul>
</div>
</div>
<!-- *********************************************************************** -->
<h2>
<a name="knownproblems">Known Problems</a>
@ -794,7 +926,7 @@ to remove a dependency on Target. </p>
<p>Known problem areas include:</p>
<ul>
<li>The CellSPU, MSP430, PTX and XCore backends are experimental.</li>
<li>The CellSPU, MSP430, and XCore backends are experimental, and the CellSPU backend will be removed in LLVM 3.3.</li>
<li>The integrated assembler, disassembler, and JIT is not supported by
several targets. If an integrated assembler is not supported, then a
@ -836,7 +968,7 @@ to remove a dependency on Target. </p>
src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
<a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
Last modified: $Date: 2012-11-20 05:22:44 +0100 (Tue, 20 Nov 2012) $
Last modified: $Date: 2012-12-19 11:50:28 +0100 (Wed, 19 Dec 2012) $
</address>
</body>


@ -197,7 +197,11 @@ public:
VK_Mips_GOT_PAGE,
VK_Mips_GOT_OFST,
VK_Mips_HIGHER,
VK_Mips_HIGHEST
VK_Mips_HIGHEST,
VK_Mips_GOT_HI16,
VK_Mips_GOT_LO16,
VK_Mips_CALL_HI16,
VK_Mips_CALL_LO16
};
private:


@ -346,7 +346,7 @@ uint8_t *RuntimeDyldImpl::createStubFunction(uint8_t *Addr) {
uint32_t *StubAddr = (uint32_t*)Addr;
*StubAddr = 0xe51ff004; // ldr pc,<label>
return (uint8_t*)++StubAddr;
} else if (Arch == Triple::mipsel) {
} else if (Arch == Triple::mipsel || Arch == Triple::mips) {
uint32_t *StubAddr = (uint32_t*)Addr;
// 0: 3c190000 lui t9,%hi(addr).
// 4: 27390000 addiu t9,t9,%lo(addr).


@ -676,7 +676,8 @@ void RuntimeDyldELF::processRelocationRef(const ObjRelocationInfo &Rel,
RelType, 0);
Section.StubOffset += getMaxStubSize();
}
} else if (Arch == Triple::mipsel && RelType == ELF::R_MIPS_26) {
} else if ((Arch == Triple::mipsel || Arch == Triple::mips) &&
RelType == ELF::R_MIPS_26) {
// This is an Mips branch relocation, need to use a stub function.
DEBUG(dbgs() << "\t\tThis is a Mips branch relocation.");
SectionEntry &Section = Sections[Rel.SectionID];


@ -168,7 +168,7 @@ protected:
inline unsigned getMaxStubSize() {
if (Arch == Triple::arm || Arch == Triple::thumb)
return 8; // 32-bit instruction and 32-bit address
else if (Arch == Triple::mipsel)
else if (Arch == Triple::mipsel || Arch == Triple::mips)
return 16;
else if (Arch == Triple::ppc64)
return 44;
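The 16-byte stub size returned for MIPS corresponds to four 32-bit instruction words. The following is a minimal sketch of such a stub, not the RuntimeDyld implementation: the first two base encodings come from the comments in the createStubFunction hunk, while the `jr t9` and `nop` words, the helper name, and the choice to fill the address in directly (instead of via relocations) are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds a four-word MIPS stub that jumps to `target`.
std::array<uint32_t, 4> buildMipsStub(uint32_t target) {
  // The low half is sign-extended by addiu, so the high half is rounded up
  // when bit 15 of the address is set.
  uint32_t hi = ((target + 0x8000u) >> 16) & 0xffffu;
  uint32_t lo = target & 0xffffu;
  std::array<uint32_t, 4> stub = {{
      0x3c190000u | hi,  // lui   t9, %hi(target)
      0x27390000u | lo,  // addiu t9, t9, %lo(target)
      0x03200008u,       // jr    t9   (assumed; not shown in this diff)
      0x00000000u        // nop in the branch delay slot (assumed)
  }};
  return stub;
}
```

Four words of four bytes each give the 16 bytes that getMaxStubSize reports for both mipsel and mips.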


@ -229,6 +229,10 @@ StringRef MCSymbolRefExpr::getVariantKindName(VariantKind Kind) {
case VK_Mips_GOT_OFST: return "GOT_OFST";
case VK_Mips_HIGHER: return "HIGHER";
case VK_Mips_HIGHEST: return "HIGHEST";
case VK_Mips_GOT_HI16: return "GOT_HI16";
case VK_Mips_GOT_LO16: return "GOT_LO16";
case VK_Mips_CALL_HI16: return "CALL_HI16";
case VK_Mips_CALL_LO16: return "CALL_LO16";
}
llvm_unreachable("Invalid variant kind");
}


@ -128,6 +128,10 @@ static void printExpr(const MCExpr *Expr, raw_ostream &OS) {
case MCSymbolRefExpr::VK_Mips_GOT_OFST: OS << "%got_ofst("; break;
case MCSymbolRefExpr::VK_Mips_HIGHER: OS << "%higher("; break;
case MCSymbolRefExpr::VK_Mips_HIGHEST: OS << "%highest("; break;
case MCSymbolRefExpr::VK_Mips_GOT_HI16: OS << "%got_hi("; break;
case MCSymbolRefExpr::VK_Mips_GOT_LO16: OS << "%got_lo("; break;
case MCSymbolRefExpr::VK_Mips_CALL_HI16: OS << "%call_hi("; break;
case MCSymbolRefExpr::VK_Mips_CALL_LO16: OS << "%call_lo("; break;
}
OS << SRE->getSymbol();


@ -42,6 +42,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) {
case Mips::fixup_Mips_GOT_PAGE:
case Mips::fixup_Mips_GOT_OFST:
case Mips::fixup_Mips_GOT_DISP:
case Mips::fixup_Mips_GOT_LO16:
case Mips::fixup_Mips_CALL_LO16:
break;
case Mips::fixup_Mips_PC16:
// So far we are only using this type for branches.
@ -60,6 +62,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) {
break;
case Mips::fixup_Mips_HI16:
case Mips::fixup_Mips_GOT_Local:
case Mips::fixup_Mips_GOT_HI16:
case Mips::fixup_Mips_CALL_HI16:
// Get the 2nd 16-bits. Also add 1 if bit 15 is 1.
Value = ((Value + 0x8000) >> 16) & 0xffff;
break;
@ -179,7 +183,11 @@ public:
{ "fixup_Mips_GOT_OFST", 0, 16, 0 },
{ "fixup_Mips_GOT_DISP", 0, 16, 0 },
{ "fixup_Mips_HIGHER", 0, 16, 0 },
{ "fixup_Mips_HIGHEST", 0, 16, 0 }
{ "fixup_Mips_HIGHEST", 0, 16, 0 },
{ "fixup_Mips_GOT_HI16", 0, 16, 0 },
{ "fixup_Mips_GOT_LO16", 0, 16, 0 },
{ "fixup_Mips_CALL_HI16", 0, 16, 0 },
{ "fixup_Mips_CALL_LO16", 0, 16, 0 }
};
if (Kind < FirstTargetFixupKind)


@ -84,7 +84,13 @@ namespace MipsII {
/// MO_HIGHER/HIGHEST - Represents the highest or higher half word of a
/// 64-bit symbol address.
MO_HIGHER,
MO_HIGHEST
MO_HIGHEST,
/// MO_GOT_HI16/LO16, MO_CALL_HI16/LO16 - Relocations used for large GOTs.
MO_GOT_HI16,
MO_GOT_LO16,
MO_CALL_HI16,
MO_CALL_LO16
};
enum {


@ -179,6 +179,18 @@ unsigned MipsELFObjectWriter::GetRelocType(const MCValue &Target,
case Mips::fixup_Mips_HIGHEST:
Type = ELF::R_MIPS_HIGHEST;
break;
case Mips::fixup_Mips_GOT_HI16:
Type = ELF::R_MIPS_GOT_HI16;
break;
case Mips::fixup_Mips_GOT_LO16:
Type = ELF::R_MIPS_GOT_LO16;
break;
case Mips::fixup_Mips_CALL_HI16:
Type = ELF::R_MIPS_CALL_HI16;
break;
case Mips::fixup_Mips_CALL_LO16:
Type = ELF::R_MIPS_CALL_LO16;
break;
}
return Type;
}


@ -116,6 +116,18 @@ namespace Mips {
// resulting in - R_MIPS_HIGHEST
fixup_Mips_HIGHEST,
// resulting in - R_MIPS_GOT_HI16
fixup_Mips_GOT_HI16,
// resulting in - R_MIPS_GOT_LO16
fixup_Mips_GOT_LO16,
// resulting in - R_MIPS_CALL_HI16
fixup_Mips_CALL_HI16,
// resulting in - R_MIPS_CALL_LO16
fixup_Mips_CALL_LO16,
// Marker
LastTargetFixupKind,
NumTargetFixupKinds = LastTargetFixupKind - FirstTargetFixupKind


@ -287,6 +287,18 @@ getMachineOpValue(const MCInst &MI, const MCOperand &MO,
case MCSymbolRefExpr::VK_Mips_HIGHEST:
FixupKind = Mips::fixup_Mips_HIGHEST;
break;
case MCSymbolRefExpr::VK_Mips_GOT_HI16:
FixupKind = Mips::fixup_Mips_GOT_HI16;
break;
case MCSymbolRefExpr::VK_Mips_GOT_LO16:
FixupKind = Mips::fixup_Mips_GOT_LO16;
break;
case MCSymbolRefExpr::VK_Mips_CALL_HI16:
FixupKind = Mips::fixup_Mips_CALL_HI16;
break;
case MCSymbolRefExpr::VK_Mips_CALL_LO16:
FixupKind = Mips::fixup_Mips_CALL_LO16;
break;
} // switch
Fixups.push_back(MCFixup::Create(0, MO.getExpr(), MCFixupKind(FixupKind)));


@ -255,6 +255,7 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi64 tblockaddress:$in)>;
def : MipsPat<(MipsHi tjumptable:$in), (LUi64 tjumptable:$in)>;
def : MipsPat<(MipsHi tconstpool:$in), (LUi64 tconstpool:$in)>;
def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi64 tglobaltlsaddr:$in)>;
def : MipsPat<(MipsHi texternalsym:$in), (LUi64 texternalsym:$in)>;
def : MipsPat<(MipsLo tglobaladdr:$in), (DADDiu ZERO_64, tglobaladdr:$in)>;
def : MipsPat<(MipsLo tblockaddress:$in), (DADDiu ZERO_64, tblockaddress:$in)>;
@ -262,6 +263,7 @@ def : MipsPat<(MipsLo tjumptable:$in), (DADDiu ZERO_64, tjumptable:$in)>;
def : MipsPat<(MipsLo tconstpool:$in), (DADDiu ZERO_64, tconstpool:$in)>;
def : MipsPat<(MipsLo tglobaltlsaddr:$in),
(DADDiu ZERO_64, tglobaltlsaddr:$in)>;
def : MipsPat<(MipsLo texternalsym:$in), (DADDiu ZERO_64, texternalsym:$in)>;
def : MipsPat<(add CPU64Regs:$hi, (MipsLo tglobaladdr:$lo)),
(DADDiu CPU64Regs:$hi, tglobaladdr:$lo)>;


@ -85,7 +85,7 @@ class MipsCodeEmitter : public MachineFunctionPass {
private:
void emitWordLE(unsigned Word);
void emitWord(unsigned Word);
/// Routines that handle operands which add machine relocations which are
/// fixed up by the relocation stage.
@ -112,12 +112,6 @@ class MipsCodeEmitter : public MachineFunctionPass {
unsigned getSizeExtEncoding(const MachineInstr &MI, unsigned OpNo) const;
unsigned getSizeInsEncoding(const MachineInstr &MI, unsigned OpNo) const;
int emitULW(const MachineInstr &MI);
int emitUSW(const MachineInstr &MI);
int emitULH(const MachineInstr &MI);
int emitULHu(const MachineInstr &MI);
int emitUSH(const MachineInstr &MI);
void emitGlobalAddressUnaligned(const GlobalValue *GV, unsigned Reloc,
int Offset) const;
};
@ -133,7 +127,7 @@ bool MipsCodeEmitter::runOnMachineFunction(MachineFunction &MF) {
MCPEs = &MF.getConstantPool()->getConstants();
MJTEs = 0;
if (MF.getJumpTableInfo()) MJTEs = &MF.getJumpTableInfo()->getJumpTables();
JTI->Initialize(MF, IsPIC);
JTI->Initialize(MF, IsPIC, Subtarget->isLittle());
MCE.setModuleInfo(&getAnalysis<MachineModuleInfo> ());
do {
@ -271,103 +265,6 @@ void MipsCodeEmitter::emitMachineBasicBlock(MachineBasicBlock *BB,
Reloc, BB));
}
int MipsCodeEmitter::emitUSW(const MachineInstr &MI) {
unsigned src = getMachineOpValue(MI, MI.getOperand(0));
unsigned base = getMachineOpValue(MI, MI.getOperand(1));
unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
// swr src, offset(base)
// swl src, offset+3(base)
MCE.emitWordLE(
(0x2e << 26) | (base << 21) | (src << 16) | (offset & 0xffff));
MCE.emitWordLE(
(0x2a << 26) | (base << 21) | (src << 16) | ((offset+3) & 0xffff));
return 2;
}
int MipsCodeEmitter::emitULW(const MachineInstr &MI) {
unsigned dst = getMachineOpValue(MI, MI.getOperand(0));
unsigned base = getMachineOpValue(MI, MI.getOperand(1));
unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
unsigned at = 1;
if (dst != base) {
// lwr dst, offset(base)
// lwl dst, offset+3(base)
MCE.emitWordLE(
(0x26 << 26) | (base << 21) | (dst << 16) | (offset & 0xffff));
MCE.emitWordLE(
(0x22 << 26) | (base << 21) | (dst << 16) | ((offset+3) & 0xffff));
return 2;
} else {
// lwr at, offset(base)
// lwl at, offset+3(base)
// addu dst, at, $zero
MCE.emitWordLE(
(0x26 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));
MCE.emitWordLE(
(0x22 << 26) | (base << 21) | (at << 16) | ((offset+3) & 0xffff));
MCE.emitWordLE(
(0x0 << 26) | (at << 21) | (0x0 << 16) | (dst << 11) | (0x0 << 6) | 0x21);
return 3;
}
}
int MipsCodeEmitter::emitUSH(const MachineInstr &MI) {
unsigned src = getMachineOpValue(MI, MI.getOperand(0));
unsigned base = getMachineOpValue(MI, MI.getOperand(1));
unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
unsigned at = 1;
// sb src, offset(base)
// srl at,src,8
// sb at, offset+1(base)
MCE.emitWordLE(
(0x28 << 26) | (base << 21) | (src << 16) | (offset & 0xffff));
MCE.emitWordLE(
(0x0 << 26) | (0x0 << 21) | (src << 16) | (at << 11) | (0x8 << 6) | 0x2);
MCE.emitWordLE(
(0x28 << 26) | (base << 21) | (at << 16) | ((offset+1) & 0xffff));
return 3;
}
int MipsCodeEmitter::emitULH(const MachineInstr &MI) {
unsigned dst = getMachineOpValue(MI, MI.getOperand(0));
unsigned base = getMachineOpValue(MI, MI.getOperand(1));
unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
unsigned at = 1;
// lbu at, offset(base)
// lb dst, offset+1(base)
// sll dst,dst,8
// or dst,dst,at
MCE.emitWordLE(
(0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));
MCE.emitWordLE(
(0x20 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff));
MCE.emitWordLE(
(0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0);
MCE.emitWordLE(
(0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25);
return 4;
}
int MipsCodeEmitter::emitULHu(const MachineInstr &MI) {
unsigned dst = getMachineOpValue(MI, MI.getOperand(0));
unsigned base = getMachineOpValue(MI, MI.getOperand(1));
unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
unsigned at = 1;
// lbu at, offset(base)
// lbu dst, offset+1(base)
// sll dst,dst,8
// or dst,dst,at
MCE.emitWordLE(
(0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));
MCE.emitWordLE(
(0x24 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff));
MCE.emitWordLE(
(0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0);
MCE.emitWordLE(
(0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25);
return 4;
}
void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) {
DEBUG(errs() << "JIT: " << (void*)MCE.getCurrentPCValue() << ":\t" << MI);
@ -377,16 +274,19 @@ void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) {
if ((MI.getDesc().TSFlags & MipsII::FormMask) == MipsII::Pseudo)
return;
emitWordLE(getBinaryCodeForInstr(MI));
emitWord(getBinaryCodeForInstr(MI));
++NumEmitted; // Keep track of the # of mi's emitted
MCE.processDebugLoc(MI.getDebugLoc(), false);
}
void MipsCodeEmitter::emitWordLE(unsigned Word) {
void MipsCodeEmitter::emitWord(unsigned Word) {
DEBUG(errs() << " 0x";
errs().write_hex(Word) << "\n");
MCE.emitWordLE(Word);
if (Subtarget->isLittle())
MCE.emitWordLE(Word);
else
MCE.emitWordBE(Word);
}
/// createMipsJITCodeEmitterPass - Return a pass that emits the collected Mips
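Note on the emitWord change in this file: the patch picks between `MCE.emitWordLE` and `MCE.emitWordBE` based on `Subtarget->isLittle()`. The following is a minimal sketch, not LLVM code, of what that choice means at the byte level. The helper name `encodeWord` and its `isLittle` parameter are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (not part of LLVM): lay out a 32-bit instruction word
// as four bytes in the order a little- or big-endian target expects.
std::array<std::uint8_t, 4> encodeWord(std::uint32_t word, bool isLittle) {
  std::array<std::uint8_t, 4> bytes{};
  for (unsigned i = 0; i < 4; ++i) {
    // Little-endian stores the least significant byte first;
    // big-endian stores the most significant byte first.
    unsigned shift = isLittle ? 8 * i : 8 * (3 - i);
    bytes[i] = static_cast<std::uint8_t>((word >> shift) & 0xffu);
  }
  return bytes;
}
```

The same word therefore produces mirrored byte sequences on the two targets, which is why unconditionally emitting little-endian words would likely produce unusable code on a big-endian MIPS host.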


@ -46,6 +46,10 @@ static cl::opt<bool>
EnableMipsTailCalls("enable-mips-tail-calls", cl::Hidden,
cl::desc("MIPS: Enable tail calls."), cl::init(false));
static cl::opt<bool>
LargeGOT("mxgot", cl::Hidden,
cl::desc("MIPS: Enable GOT larger than 64k."), cl::init(false));
static const uint16_t O32IntRegs[4] = {
Mips::A0, Mips::A1, Mips::A2, Mips::A3
};
@ -77,6 +81,71 @@ static SDValue GetGlobalReg(SelectionDAG &DAG, EVT Ty) {
return DAG.getRegister(FI->getGlobalBaseReg(), Ty);
}
static SDValue getTargetNode(SDValue Op, SelectionDAG &DAG, unsigned Flag) {
EVT Ty = Op.getValueType();
if (GlobalAddressSDNode *N = dyn_cast<GlobalAddressSDNode>(Op))
return DAG.getTargetGlobalAddress(N->getGlobal(), Op.getDebugLoc(), Ty, 0,
Flag);
if (ExternalSymbolSDNode *N = dyn_cast<ExternalSymbolSDNode>(Op))
return DAG.getTargetExternalSymbol(N->getSymbol(), Ty, Flag);
if (BlockAddressSDNode *N = dyn_cast<BlockAddressSDNode>(Op))
return DAG.getTargetBlockAddress(N->getBlockAddress(), Ty, 0, Flag);
if (JumpTableSDNode *N = dyn_cast<JumpTableSDNode>(Op))
return DAG.getTargetJumpTable(N->getIndex(), Ty, Flag);
if (ConstantPoolSDNode *N = dyn_cast<ConstantPoolSDNode>(Op))
return DAG.getTargetConstantPool(N->getConstVal(), Ty, N->getAlignment(),
N->getOffset(), Flag);
llvm_unreachable("Unexpected node type.");
return SDValue();
}
static SDValue getAddrNonPIC(SDValue Op, SelectionDAG &DAG) {
DebugLoc DL = Op.getDebugLoc();
EVT Ty = Op.getValueType();
SDValue Hi = getTargetNode(Op, DAG, MipsII::MO_ABS_HI);
SDValue Lo = getTargetNode(Op, DAG, MipsII::MO_ABS_LO);
return DAG.getNode(ISD::ADD, DL, Ty,
DAG.getNode(MipsISD::Hi, DL, Ty, Hi),
DAG.getNode(MipsISD::Lo, DL, Ty, Lo));
}
static SDValue getAddrLocal(SDValue Op, SelectionDAG &DAG, bool HasMips64) {
DebugLoc DL = Op.getDebugLoc();
EVT Ty = Op.getValueType();
unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
SDValue GOT = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty),
getTargetNode(Op, DAG, GOTFlag));
SDValue Load = DAG.getLoad(Ty, DL, DAG.getEntryNode(), GOT,
MachinePointerInfo::getGOT(), false, false, false,
0);
unsigned LoFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
SDValue Lo = DAG.getNode(MipsISD::Lo, DL, Ty, getTargetNode(Op, DAG, LoFlag));
return DAG.getNode(ISD::ADD, DL, Ty, Load, Lo);
}
static SDValue getAddrGlobal(SDValue Op, SelectionDAG &DAG, unsigned Flag) {
DebugLoc DL = Op.getDebugLoc();
EVT Ty = Op.getValueType();
SDValue Tgt = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty),
getTargetNode(Op, DAG, Flag));
return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Tgt,
MachinePointerInfo::getGOT(), false, false, false, 0);
}
static SDValue getAddrGlobalLargeGOT(SDValue Op, SelectionDAG &DAG,
unsigned HiFlag, unsigned LoFlag) {
DebugLoc DL = Op.getDebugLoc();
EVT Ty = Op.getValueType();
SDValue Hi = DAG.getNode(MipsISD::Hi, DL, Ty, getTargetNode(Op, DAG, HiFlag));
Hi = DAG.getNode(ISD::ADD, DL, Ty, Hi, GetGlobalReg(DAG, Ty));
SDValue Wrapper = DAG.getNode(MipsISD::Wrapper, DL, Ty, Hi,
getTargetNode(Op, DAG, LoFlag));
return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Wrapper,
MachinePointerInfo::getGOT(), false, false, false, 0);
}
const char *MipsTargetLowering::getTargetNodeName(unsigned Opcode) const {
switch (Opcode) {
case MipsISD::JmpLink: return "MipsISD::JmpLink";
@ -1743,8 +1812,6 @@ SDValue MipsTargetLowering::LowerGlobalAddress(SDValue Op,
const GlobalValue *GV = cast<GlobalAddressSDNode>(Op)->getGlobal();
if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {
SDVTList VTs = DAG.getVTList(MVT::i32);
const MipsTargetObjectFile &TLOF =
(const MipsTargetObjectFile&)getObjFileLowering();
@ -1752,69 +1819,33 @@ SDValue MipsTargetLowering::LowerGlobalAddress(SDValue Op,
if (TLOF.IsGlobalInSmallSection(GV, getTargetMachine())) {
SDValue GA = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,
MipsII::MO_GPREL);
SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl, VTs, &GA, 1);
SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl,
DAG.getVTList(MVT::i32), &GA, 1);
SDValue GPReg = DAG.getRegister(Mips::GP, MVT::i32);
return DAG.getNode(ISD::ADD, dl, MVT::i32, GPReg, GPRelNode);
}
// %hi/%lo relocation
SDValue GAHi = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,
MipsII::MO_ABS_HI);
SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,
MipsII::MO_ABS_LO);
SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, VTs, &GAHi, 1);
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, GALo);
return DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo);
return getAddrNonPIC(Op, DAG);
}
EVT ValTy = Op.getValueType();
bool HasGotOfst = (GV->hasInternalLinkage() ||
(GV->hasLocalLinkage() && !isa<Function>(GV)));
unsigned GotFlag = HasMips64 ?
(HasGotOfst ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT_DISP) :
(HasGotOfst ? MipsII::MO_GOT : MipsII::MO_GOT16);
SDValue GA = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0, GotFlag);
GA = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), GA);
SDValue ResNode = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), GA,
MachinePointerInfo(), false, false, false, 0);
// On functions and global targets not internal linked only
// a load from got/GP is necessary for PIC to work.
if (!HasGotOfst)
return ResNode;
SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0,
HasMips64 ? MipsII::MO_GOT_OFST :
MipsII::MO_ABS_LO);
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, GALo);
return DAG.getNode(ISD::ADD, dl, ValTy, ResNode, Lo);
if (GV->hasInternalLinkage() || (GV->hasLocalLinkage() && !isa<Function>(GV)))
return getAddrLocal(Op, DAG, HasMips64);
if (LargeGOT)
return getAddrGlobalLargeGOT(Op, DAG, MipsII::MO_GOT_HI16,
MipsII::MO_GOT_LO16);
return getAddrGlobal(Op, DAG,
HasMips64 ? MipsII::MO_GOT_DISP : MipsII::MO_GOT16);
}
SDValue MipsTargetLowering::LowerBlockAddress(SDValue Op,
SelectionDAG &DAG) const {
const BlockAddress *BA = cast<BlockAddressSDNode>(Op)->getBlockAddress();
// FIXME there isn't actually debug info here
DebugLoc dl = Op.getDebugLoc();
if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)
return getAddrNonPIC(Op, DAG);
if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {
// %hi/%lo relocation
SDValue BAHi =
DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_HI);
SDValue BALo =
DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_LO);
SDValue Hi = DAG.getNode(MipsISD::Hi, dl, MVT::i32, BAHi);
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, BALo);
return DAG.getNode(ISD::ADD, dl, MVT::i32, Hi, Lo);
}
EVT ValTy = Op.getValueType();
unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
unsigned OFSTFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
SDValue BAGOTOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, GOTFlag);
BAGOTOffset = DAG.getNode(MipsISD::Wrapper, dl, ValTy,
GetGlobalReg(DAG, ValTy), BAGOTOffset);
SDValue BALOOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, OFSTFlag);
SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), BAGOTOffset,
MachinePointerInfo(), false, false, false, 0);
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, BALOOffset);
return DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo);
return getAddrLocal(Op, DAG, HasMips64);
}
SDValue MipsTargetLowering::
@ -1901,41 +1932,15 @@ LowerGlobalTLSAddress(SDValue Op, SelectionDAG &DAG) const
SDValue MipsTargetLowering::
LowerJumpTable(SDValue Op, SelectionDAG &DAG) const
{
SDValue HiPart, JTI, JTILo;
// FIXME there isn't actually debug info here
DebugLoc dl = Op.getDebugLoc();
bool IsPIC = getTargetMachine().getRelocationModel() == Reloc::PIC_;
EVT PtrVT = Op.getValueType();
JumpTableSDNode *JT = cast<JumpTableSDNode>(Op);
if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)
return getAddrNonPIC(Op, DAG);
if (!IsPIC && !IsN64) {
JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_HI);
HiPart = DAG.getNode(MipsISD::Hi, dl, PtrVT, JTI);
JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_LO);
} else {// Emit Load from Global Pointer
unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
unsigned OfstFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, GOTFlag);
JTI = DAG.getNode(MipsISD::Wrapper, dl, PtrVT, GetGlobalReg(DAG, PtrVT),
JTI);
HiPart = DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), JTI,
MachinePointerInfo(), false, false, false, 0);
JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, OfstFlag);
}
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, PtrVT, JTILo);
return DAG.getNode(ISD::ADD, dl, PtrVT, HiPart, Lo);
return getAddrLocal(Op, DAG, HasMips64);
}
SDValue MipsTargetLowering::
LowerConstantPool(SDValue Op, SelectionDAG &DAG) const
{
SDValue ResNode;
ConstantPoolSDNode *N = cast<ConstantPoolSDNode>(Op);
const Constant *C = N->getConstVal();
// FIXME there isn't actually debug info here
DebugLoc dl = Op.getDebugLoc();
// gp_rel relocation
// FIXME: we should reference the constant pool using small data sections,
// but the asm printer currently doesn't support this feature without
@ -1946,31 +1951,10 @@ LowerConstantPool(SDValue Op, SelectionDAG &DAG) const
// SDValue GOT = DAG.getGLOBAL_OFFSET_TABLE(MVT::i32);
// ResNode = DAG.getNode(ISD::ADD, MVT::i32, GOT, GPRelNode);
if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {
SDValue CPHi = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(),
N->getOffset(), MipsII::MO_ABS_HI);
SDValue CPLo = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(),
N->getOffset(), MipsII::MO_ABS_LO);
SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, MVT::i32, CPHi);
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, CPLo);
ResNode = DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo);
} else {
EVT ValTy = Op.getValueType();
unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
unsigned OFSTFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
SDValue CP = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(),
N->getOffset(), GOTFlag);
CP = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), CP);
SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), CP,
MachinePointerInfo::getConstantPool(), false,
false, false, 0);
SDValue CPLo = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(),
N->getOffset(), OFSTFlag);
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, CPLo);
ResNode = DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo);
}
if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)
return getAddrNonPIC(Op, DAG);
return ResNode;
return getAddrLocal(Op, DAG, HasMips64);
}
SDValue MipsTargetLowering::LowerVASTART(SDValue Op, SelectionDAG &DAG) const {
@ -2862,60 +2846,41 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
// If the callee is a GlobalAddress/ExternalSymbol node (quite common, every
// direct call is) turn it into a TargetGlobalAddress/TargetExternalSymbol
// node so that legalize doesn't hack it.
unsigned char OpFlag;
bool IsPICCall = (IsN64 || IsPIC); // true if calls are translated to jalr $25
bool GlobalOrExternal = false;
SDValue CalleeLo;
if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) {
if (IsPICCall && G->getGlobal()->hasInternalLinkage()) {
OpFlag = IsO32 ? MipsII::MO_GOT : MipsII::MO_GOT_PAGE;
unsigned char LoFlag = IsO32 ? MipsII::MO_ABS_LO : MipsII::MO_GOT_OFST;
if (IsPICCall) {
if (G->getGlobal()->hasInternalLinkage())
Callee = getAddrLocal(Callee, DAG, HasMips64);
else if (LargeGOT)
Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16,
MipsII::MO_CALL_LO16);
else
Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL);
} else
Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(), 0,
OpFlag);
CalleeLo = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(),
0, LoFlag);
} else {
OpFlag = IsPICCall ? MipsII::MO_GOT_CALL : MipsII::MO_NO_FLAG;
Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl,
getPointerTy(), 0, OpFlag);
}
MipsII::MO_NO_FLAG);
GlobalOrExternal = true;
}
else if (ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(Callee)) {
if (IsN64 || (!IsO32 && IsPIC))
OpFlag = MipsII::MO_GOT_DISP;
else if (!IsPIC) // !N64 && static
OpFlag = MipsII::MO_NO_FLAG;
if (!IsN64 && !IsPIC) // !N64 && static
Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(),
MipsII::MO_NO_FLAG);
else if (LargeGOT)
Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16,
MipsII::MO_CALL_LO16);
else if (HasMips64)
Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_DISP);
else // O32 & PIC
OpFlag = MipsII::MO_GOT_CALL;
Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(),
OpFlag);
Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL);
GlobalOrExternal = true;
}
SDValue InFlag;
// Create nodes that load address of callee and copy it to T9
if (IsPICCall) {
if (GlobalOrExternal) {
// Load callee address
Callee = DAG.getNode(MipsISD::Wrapper, dl, getPointerTy(),
GetGlobalReg(DAG, getPointerTy()), Callee);
SDValue LoadValue = DAG.getLoad(getPointerTy(), dl, DAG.getEntryNode(),
Callee, MachinePointerInfo::getGOT(),
false, false, false, 0);
// Use GOT+LO if callee has internal linkage.
if (CalleeLo.getNode()) {
SDValue Lo = DAG.getNode(MipsISD::Lo, dl, getPointerTy(), CalleeLo);
Callee = DAG.getNode(ISD::ADD, dl, getPointerTy(), LoadValue, Lo);
} else
Callee = LoadValue;
}
}
// T9 register operand.
SDValue T9;


@ -1154,12 +1154,14 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi tblockaddress:$in)>;
def : MipsPat<(MipsHi tjumptable:$in), (LUi tjumptable:$in)>;
def : MipsPat<(MipsHi tconstpool:$in), (LUi tconstpool:$in)>;
def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi tglobaltlsaddr:$in)>;
def : MipsPat<(MipsHi texternalsym:$in), (LUi texternalsym:$in)>;
def : MipsPat<(MipsLo tglobaladdr:$in), (ADDiu ZERO, tglobaladdr:$in)>;
def : MipsPat<(MipsLo tblockaddress:$in), (ADDiu ZERO, tblockaddress:$in)>;
def : MipsPat<(MipsLo tjumptable:$in), (ADDiu ZERO, tjumptable:$in)>;
def : MipsPat<(MipsLo tconstpool:$in), (ADDiu ZERO, tconstpool:$in)>;
def : MipsPat<(MipsLo tglobaltlsaddr:$in), (ADDiu ZERO, tglobaltlsaddr:$in)>;
def : MipsPat<(MipsLo texternalsym:$in), (ADDiu ZERO, texternalsym:$in)>;
def : MipsPat<(add CPURegs:$hi, (MipsLo tglobaladdr:$lo)),
(ADDiu CPURegs:$hi, tglobaladdr:$lo)>;


@ -222,10 +222,17 @@ void *MipsJITInfo::emitFunctionStub(const Function *F, void *Fn,
// addiu t9, t9, %lo(EmittedAddr)
// jalr t8, t9
// nop
JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi);
JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo);
JCE.emitWordLE(25 << 21 | 24 << 11 | 9);
JCE.emitWordLE(0);
if (IsLittleEndian) {
JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi);
JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo);
JCE.emitWordLE(25 << 21 | 24 << 11 | 9);
JCE.emitWordLE(0);
} else {
JCE.emitWordBE(0xf << 26 | 25 << 16 | Hi);
JCE.emitWordBE(9 << 26 | 25 << 21 | 25 << 16 | Lo);
JCE.emitWordBE(25 << 21 | 24 << 11 | 9);
JCE.emitWordBE(0);
}
sys::Memory::InvalidateInstructionCache(Addr, 16);
if (!sys::Memory::setRangeExecutable(Addr, 16))
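Note on the stub comments in this hunk: the `lui`/`addiu` pair builds a 32-bit address from a `%hi` and a `%lo` half. The sketch below shows the usual way such a split is computed, assuming the standard MIPS behavior that `addiu` sign-extends its 16-bit immediate. The names `HiLo`, `splitHiLo` and `rebuild` are hypothetical and do not appear in the patch.

```cpp
#include <cstdint>

// Hypothetical illustration type: the two 16-bit halves of an address.
struct HiLo {
  std::uint32_t hi;  // immediate for lui
  std::uint32_t lo;  // immediate for addiu (sign-extended by the CPU)
};

// Split an address so that (hi << 16) + sign_extend(lo) == addr.
// Adding 0x8000 before shifting compensates for a negative low half.
HiLo splitHiLo(std::uint32_t addr) {
  HiLo parts;
  parts.lo = addr & 0xffffu;
  parts.hi = ((addr + 0x8000u) >> 16) & 0xffffu;
  return parts;
}

// Recombine the halves the way lui followed by addiu would.
std::uint32_t rebuild(HiLo parts) {
  // Sign-extend the low 16 bits using modular unsigned arithmetic.
  std::uint32_t signExtendedLo = (parts.lo ^ 0x8000u) - 0x8000u;
  return (parts.hi << 16) + signExtendedLo;
}
```

The endianness of the emitted words does not affect this arithmetic; only the byte order in which each encoded instruction is written differs.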


@ -26,10 +26,11 @@ class MipsTargetMachine;
class MipsJITInfo : public TargetJITInfo {
bool IsPIC;
bool IsLittleEndian;
public:
explicit MipsJITInfo() :
IsPIC(false) {}
IsPIC(false), IsLittleEndian(true) {}
/// replaceMachineCodeForFunction - Make it so that calling the function
/// whose machine code is at OLD turns into a call to NEW, perhaps by
@ -58,8 +59,10 @@ class MipsJITInfo : public TargetJITInfo {
unsigned NumRelocs, unsigned char *GOTBase);
/// Initialize - Initialize internal stage for the function being JITted.
void Initialize(const MachineFunction &MF, bool isPIC) {
void Initialize(const MachineFunction &MF, bool isPIC,
bool isLittleEndian) {
IsPIC = isPIC;
IsLittleEndian = isLittleEndian;
}
};


@ -62,6 +62,10 @@ MCOperand MipsMCInstLower::LowerSymbolOperand(const MachineOperand &MO,
case MipsII::MO_GOT_OFST: Kind = MCSymbolRefExpr::VK_Mips_GOT_OFST; break;
case MipsII::MO_HIGHER: Kind = MCSymbolRefExpr::VK_Mips_HIGHER; break;
case MipsII::MO_HIGHEST: Kind = MCSymbolRefExpr::VK_Mips_HIGHEST; break;
case MipsII::MO_GOT_HI16: Kind = MCSymbolRefExpr::VK_Mips_GOT_HI16; break;
case MipsII::MO_GOT_LO16: Kind = MCSymbolRefExpr::VK_Mips_GOT_LO16; break;
case MipsII::MO_CALL_HI16: Kind = MCSymbolRefExpr::VK_Mips_CALL_HI16; break;
case MipsII::MO_CALL_LO16: Kind = MCSymbolRefExpr::VK_Mips_CALL_LO16; break;
}
switch (MOTy) {


@ -2160,6 +2160,9 @@ static bool isIntegerWideningViable(const DataLayout &TD,
AllocaPartitioning::const_use_iterator I,
AllocaPartitioning::const_use_iterator E) {
uint64_t SizeInBits = TD.getTypeSizeInBits(AllocaTy);
// Don't create integer types larger than the maximum bitwidth.
if (SizeInBits > IntegerType::MAX_INT_BITS)
return false;
// Don't try to handle allocas with bit-padding.
if (SizeInBits != TD.getTypeStoreSizeInBits(AllocaTy))
@ -2198,7 +2201,7 @@ static bool isIntegerWideningViable(const DataLayout &TD,
if (RelBegin == 0 && RelEnd == Size)
WholeAllocaOp = true;
if (IntegerType *ITy = dyn_cast<IntegerType>(LI->getType())) {
if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy))
if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy))
return false;
continue;
}
@ -2214,7 +2217,7 @@ static bool isIntegerWideningViable(const DataLayout &TD,
if (RelBegin == 0 && RelEnd == Size)
WholeAllocaOp = true;
if (IntegerType *ITy = dyn_cast<IntegerType>(ValueTy)) {
if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy))
if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy))
return false;
continue;
}


@ -0,0 +1,50 @@
; RUN: llc -march=mipsel -mxgot < %s | FileCheck %s -check-prefix=O32
; RUN: llc -march=mips64el -mcpu=mips64r2 -mattr=+n64 -mxgot < %s | \
; RUN: FileCheck %s -check-prefix=N64
@v0 = external global i32
define void @foo1() nounwind {
entry:
; O32: lui $[[R0:[0-9]+]], %got_hi(v0)
; O32: addu $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}}
; O32: lw ${{[0-9]+}}, %got_lo(v0)($[[R1]])
; O32: lui $[[R2:[0-9]+]], %call_hi(foo0)
; O32: addu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
; O32: lw ${{[0-9]+}}, %call_lo(foo0)($[[R3]])
; N64: lui $[[R0:[0-9]+]], %got_hi(v0)
; N64: daddu $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}}
; N64: ld ${{[0-9]+}}, %got_lo(v0)($[[R1]])
; N64: lui $[[R2:[0-9]+]], %call_hi(foo0)
; N64: daddu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
; N64: ld ${{[0-9]+}}, %call_lo(foo0)($[[R3]])
%0 = load i32* @v0, align 4
tail call void @foo0(i32 %0) nounwind
ret void
}
declare void @foo0(i32)
; call to external function.
define void @foo2(i32* nocapture %d, i32* nocapture %s, i32 %n) nounwind {
entry:
; O32: foo2:
; O32: lui $[[R2:[0-9]+]], %call_hi(memcpy)
; O32: addu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
; O32: lw ${{[0-9]+}}, %call_lo(memcpy)($[[R3]])
; N64: foo2:
; N64: lui $[[R2:[0-9]+]], %call_hi(memcpy)
; N64: daddu $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
; N64: ld ${{[0-9]+}}, %call_lo(memcpy)($[[R3]])
%0 = bitcast i32* %d to i8*
%1 = bitcast i32* %s to i8*
tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 %n, i32 4, i1 false)
ret void
}
declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind

test/MC/Mips/xgot.ll Normal file

@ -0,0 +1,42 @@
; RUN: llc -filetype=obj -mtriple mipsel-unknown-linux -mxgot %s -o - | elf-dump --dump-section-data | FileCheck %s
@.str = private unnamed_addr constant [16 x i8] c"ext_1=%d, i=%d\0A\00", align 1
@ext_1 = external global i32
define void @fill() nounwind {
entry:
; Check that the appropriate relocations were created.
; For the xgot case we want to see R_MIPS_[GOT|CALL]_[HI|LO]16.
; R_MIPS_HI16
; CHECK: ('r_type', 0x05)
; R_MIPS_LO16
; CHECK: ('r_type', 0x06)
; R_MIPS_GOT_HI16
; CHECK: ('r_type', 0x16)
; R_MIPS_GOT_LO16
; CHECK: ('r_type', 0x17)
; R_MIPS_GOT
; CHECK: ('r_type', 0x09)
; R_MIPS_LO16
; CHECK: ('r_type', 0x06)
; R_MIPS_CALL_HI16
; CHECK: ('r_type', 0x1e)
; R_MIPS_CALL_LO16
; CHECK: ('r_type', 0x1f)
%0 = load i32* @ext_1, align 4
%call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([16 x i8]* @.str, i32 0, i32 0), i32 %0) nounwind
ret void
}
declare i32 @printf(i8* nocapture, ...) nounwind


@ -1134,3 +1134,45 @@ entry:
ret void
; CHECK: ret
}
define void @PR14465() {
; Ensure that we don't crash when analyzing an alloca larger than the maximum
; integer type width (MAX_INT_BITS) supported by llvm (1048576*32 > (1<<23)-1).
; CHECK: @PR14465
%stack = alloca [1048576 x i32], align 16
; CHECK: alloca [1048576 x i32]
%cast = bitcast [1048576 x i32]* %stack to i8*
call void @llvm.memset.p0i8.i64(i8* %cast, i8 -2, i64 4194304, i32 16, i1 false)
ret void
; CHECK: ret
}
define void @PR14548(i1 %x) {
; Handle a mixture of i1 and i8 loads and stores to allocas. This particular
; pattern caused crashes and invalid output in the PR, and its nature will
; trigger a mixture in several permutations as we resolve each alloca
; iteratively.
; Note that we don't do a particularly good *job* of handling these mixtures,
; but the hope is that this is very rare.
; CHECK: @PR14548
entry:
%a = alloca <{ i1 }>, align 8
%b = alloca <{ i1 }>, align 8
; Nothing of interest is simplified here.
; CHECK: alloca
; CHECK: alloca
%b.i1 = bitcast <{ i1 }>* %b to i1*
store i1 %x, i1* %b.i1, align 8
%b.i8 = bitcast <{ i1 }>* %b to i8*
%foo = load i8* %b.i8, align 1
%a.i8 = bitcast <{ i1 }>* %a to i8*
call void @llvm.memcpy.p0i8.p0i8.i32(i8* %a.i8, i8* %b.i8, i32 1, i32 1, i1 false) nounwind
%bar = load i8* %a.i8, align 1
%a.i1 = getelementptr inbounds <{ i1 }>* %a, i32 0, i32 0
%baz = load i1* %a.i1, align 1
ret void
}
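Note on the two tests above: they exercise the two guards changed in `isIntegerWideningViable`. The sketch below reproduces the arithmetic only, as a rough illustration rather than the SROA implementation. It assumes the maximum integer width is `(1<<23)-1` bits, as stated in the PR14465 test comment, and that a type's store size is its bit width rounded up to whole bytes. The names `kMaxIntBits`, `fitsInIntegerType`, `storeSizeInBytes` and `hasBitPadding` are hypothetical.

```cpp
#include <cstdint>

// Assumed limit, taken from the test comment: (1<<23)-1 bits.
constexpr std::uint64_t kMaxIntBits = (1u << 23) - 1;

// Guard 1: an alloca can only be widened to a single integer if that
// integer type would not exceed the maximum bit width.
constexpr bool fitsInIntegerType(std::uint64_t sizeInBits) {
  return sizeInBits <= kMaxIntBits;
}

// Store size in bytes: bit width rounded up to a whole number of bytes.
constexpr std::uint64_t storeSizeInBytes(std::uint64_t bitWidth) {
  return (bitWidth + 7) / 8;
}

// Guard 2: an integer type has bit padding when its width is smaller than
// its store size expressed in bits. Comparing against the size in bytes
// (the pre-patch form) misses cases such as i1: 1 < 1 is false.
constexpr bool hasBitPadding(std::uint64_t bitWidth) {
  return bitWidth < storeSizeInBytes(bitWidth) * 8;
}

// The PR14465 alloca: 1048576 elements of 32 bits each is too wide.
static_assert(!fitsInIntegerType(1048576ull * 32), "alloca too wide");
// The PR14548 loads: i1 is padded to a full byte when stored.
static_assert(hasBitPadding(1), "i1 carries bit padding");
```

Under these assumptions, the PR14465 alloca is rejected by the first guard and the `i1` accesses in PR14548 are rejected by the second, which matches the tests expecting both allocas to be left in place.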


@ -82,14 +82,9 @@ entry:
%a0i16ptr = bitcast i8* %a0ptr to i16*
store i16 1, i16* %a0i16ptr
; CHECK: %[[mask0:.*]] = and i16 1, -16
%a1i4ptr = bitcast i8* %a1ptr to i4*
store i4 1, i4* %a1i4ptr
; CHECK-NEXT: %[[insert0:.*]] = or i16 %[[mask0]], 1
store i8 1, i8* %a2ptr
; CHECK-NEXT: %[[mask1:.*]] = and i40 undef, 4294967295
; CHECK: %[[mask1:.*]] = and i40 undef, 4294967295
; CHECK-NEXT: %[[insert1:.*]] = or i40 %[[mask1]], 4294967296
%a3i24ptr = bitcast i8* %a3ptr to i24*
@ -110,7 +105,7 @@ entry:
%ai = load i56* %aiptr
%ret = zext i56 %ai to i64
ret i64 %ret
; CHECK-NEXT: %[[ext4:.*]] = zext i16 %[[insert0]] to i56
; CHECK-NEXT: %[[ext4:.*]] = zext i16 1 to i56
; CHECK-NEXT: %[[shift4:.*]] = shl i56 %[[ext4]], 40
; CHECK-NEXT: %[[mask4:.*]] = and i56 %[[insert3]], 1099511627775
; CHECK-NEXT: %[[insert4:.*]] = or i56 %[[mask4]], %[[shift4]]