Vendor import of llvm tags/RELEASE_32/final r170710 (effectively, 3.2 release): http://llvm.org/svn/llvm-project/llvm/tags/RELEASE_32/final@170710
2012-12-22 14:58:30 +00:00
commit 482e7bddf6
parent 522600a229
24 changed files with 563 additions and 353 deletions
--- a/docs/ReleaseNotes.html
+++ b/docs/ReleaseNotes.html
@ -29,12 +29,6 @@
  <p>Written by the <a href="http://llvm.org/">LLVM Team</a></p>
 </div>

-<h1 style="color:red">These are in-progress notes for the upcoming LLVM 3.2
-release.<br>
-You may prefer the
-<a href="http://llvm.org/releases/3.1/docs/ReleaseNotes.html">LLVM 3.1
-Release Notes</a>.</h1>
-
 <!-- *********************************************************************** -->
 <h2>
  <a name="intro">Introduction</a>
@ -46,7 +40,7 @@ Release Notes</a>.</h1>
 <p>This document contains the release notes for the LLVM Compiler
   Infrastructure, release 3.2.  Here we describe the status of LLVM, including
   major improvements from the previous release, improvements in various
-   subprojects of LLVM, and some of the current users of the code.  All LLVM
+   sub-projects of LLVM, and some of the current users of the code.  All LLVM
   releases may be downloaded from the <a href="http://llvm.org/releases/">LLVM
   releases web site</a>.</p>

@ -72,11 +66,12 @@ Release Notes</a>.</h1>

 <div>

-<p>The LLVM 3.2 distribution currently consists of code from the core LLVM
-   repository, which roughly includes the LLVM optimizers, code generators and
-   supporting tools, and the Clang repository. In addition to this code, the
-   LLVM Project includes other sub-projects that are in development. Here we
-   include updates on these subprojects.</p>
+<p>The LLVM 3.2 distribution currently consists of production-quality code
+   from the core LLVM repository, which roughly includes the LLVM optimizers,
+   code generators and supporting tools, as well as Clang, DragonEgg and 
+   compiler-rt sub-project repositories. In addition to this code, the LLVM 
+   Project includes other sub-projects that are in development. Here we 
+   include updates on these sub-projects.</p>

 <!--=========================================================================-->
 <h3>
@ -90,18 +85,18 @@ Release Notes</a>.</h1>
   experience through expressive diagnostics, a high level of conformance to
   language standards, fast compilation, and low memory use. Like LLVM, Clang
   provides a modular, library-based architecture that makes it suitable for
-   creating or integrating with other development tools. Clang is considered a
-   production-quality compiler for C, Objective-C, C++ and Objective-C++ on x86
-   (32- and 64-bit), and for Darwin/ARM targets.</p>
+   creating or integrating with other development tools.</p>

 <p>In the LLVM 3.2 time-frame, the Clang team has made many improvements.
   Highlights include:</p>
 <ul>
-  <li>...</li>
+  <li>Improvements to Clang's diagnostics</li>
+  <li>Support for tls_model attribute</li>
+  <li>Type safety attributes</li>
 </ul>

 <p>For more details about the changes to Clang since the 3.1 release, see the
-   <a href="http://clang.llvm.org/docs/ReleaseNotes.html">Clang release
+   <a href="http://llvm.org/releases/3.2/tools/clang/docs/ReleaseNotes.html">Clang 3.2 release
   notes.</a></p>

 <p>If Clang rejects your code but another compiler accepts it, please take a
@ -129,7 +124,10 @@ Release Notes</a>.</h1>
 <p>The 3.2 release has the following notable changes:</p>

 <ul>
-  <li>...</li>
+ <li>Able to load LLVM plugins such as Polly.</li>
+ <li>Supports thread-local storage models.</li>
+ <li>Passes knowledge of variable lifetimes to the LLVM optimizers.</li>
+ <li>No longer requires GCC to be built with LTO support.</li>
 </ul>

 </div>
@ -141,7 +139,8 @@ Release Notes</a>.</h1>

 <div>

-<p>The new LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>
+
+<p>The LLVM <a href="http://compiler-rt.llvm.org/">compiler-rt project</a>
   is a simple library that provides an implementation of the low-level
   target-specific hooks required by code generation and other runtime
   components.  For example, when compiling for a 32-bit target, converting a
@ -153,7 +152,11 @@ Release Notes</a>.</h1>
 <p>The 3.2 release has the following notable changes:</p>

 <ul>
-  <li>...</li>
+  <li><a href="http://llvm.org/releases/3.2/tools/clang/docs/ThreadSanitizer.html">ThreadSanitizer (TSan)</a> - data race detector run-time library for C/C++ has been added.</li>
+  <li>Improvements to <a href="http://llvm.org/releases/3.2/tools/clang/docs/AddressSanitizer.html">AddressSanitizer</a> including: better portability 
+  (OSX, Android NDK), support for cmake based builds, enhanced error reporting and lots of bug fixes.</li>
+  <li>Added support for A6 'Swift' CPU.</li>
+  <li><code>divsi3</code> function has been enhanced to take advantage of a hardware unsigned divide when it is available.</li>
 </ul>

 </div>
@ -174,7 +177,9 @@ Release Notes</a>.</h1>
 <p>The 3.2 release has the following notable changes:</p>

 <ul>
-  <li>...</li>
+  <li>Linux build fixes for clang (see <a href="http://lldb.llvm.org/build.html">Building LLDB</a>)</li>
+  <li>Some Linux stability and usability improvements</li>
+  <li>Switch expression evaluation to use MCJIT (from legacy JIT) on Linux</li>
 </ul>

 </div>
@ -193,7 +198,15 @@ Release Notes</a>.</h1>
 <p>Within the LLVM 3.2 time-frame there were the following highlights:</p>

 <ul>
-  <li>...</li>
+  <li> C++11 shared_ptr atomic access API (20.7.2.5) has been implemented.</li>
+  <li>Applied noexcept and constexpr throughout library.</li>
+  <li>Improved C++11 conformance in associative container emplace.</li>
+  <li>Performance improvements in: std::rotate algorithm and I/O.</li>
+  <li>Operator new/delete and type_infos for exception types moved from libc++ to libc++abi.</li>
+  <li>Bug fixes in: <code>&lt;atomic&gt;</code>, vector<code>&lt;bool&gt;</code> algorithms,
+    <code>&lt;future&gt;</code>, <code>&lt;tuple&gt;</code>,
+    <code>&lt;type_traits&gt;</code>, <code>&lt;fstream&gt;</code>, <code>&lt;istream&gt;</code>,
+    <code>&lt;iterator&gt;</code>, <code>&lt;condition_variable&gt;</code>, <code>&lt;complex&gt;</code>, as well as visibility fixes.</li>
 </ul>

 </div>
@ -212,7 +225,7 @@ Release Notes</a>.</h1>
 <p>The 3.2 release has the following notable changes:</p>

 <ul>
-  <li>...</li>
+  <li>Bug fixes only, no functional changes.</li>
 </ul>

 </div>
@ -227,18 +240,63 @@ Release Notes</a>.</h1>

 <p><a href="http://polly.llvm.org/">Polly</a> is an <em>experimental</em>
  optimizer for data locality and parallelism. It currently provides high-level
-  loop optimizations and automatic parallelisation (using the OpenMP run time).
+  loop optimizations and automatic parallelization (using the OpenMP run time).
  Work in the area of automatic SIMD and accelerator code generation was
  started.</p>

 <p>Within the LLVM 3.2 time-frame there were the following highlights:</p>

 <ul>
-  <li>...</li>
+  <li>isl, the integer set library used by Polly, was relicensed under the MIT license.</li>
+  <li>isl based code generation.</li>
+  <li>MIT licensed replacement for CLooG (LGPLv2).</li>
+  <li>Fine grained option handling (separation of core and border computations, control overhead vs. code size).</li>
+  <li>Support for FORTRAN and Dragonegg.</li>
+  <li>OpenMP code generation fixes.</li>
 </ul>

 </div>

+<!--=========================================================================-->
+<h3>
+<a name="StaticAnalyzer">Clang Static Analyzer</a>
+</h3>
+
+<div>
+
+<p>The <a href="http://clang-analyzer.llvm.org/">Clang Static Analyzer</a> 
+    is an advanced source code analysis tool integrated into Clang that performs
+    a deep analysis of code to find potential bugs.</p>
+    
+<p>In the LLVM 3.2 release, the static analyzer has made significant improvements
+    in many areas, with notable highlights such as:</p>
+    
+<ul>
+    <li>Improved interprocedural analysis within a translation unit (see details below), which greatly amplified the analyzer's ability to find bugs.</li>
+    <li>New infrastructure to model &quot;well-known&quot; APIs, allowing the analyzer to do a much better job when modeling calls to such functions.</li>
+    <li>Significant improvements to the APIs to write static analyzer checkers, with a more unified way of representing function/method calls in the checker API.  Details can be found in the <a href="http://llvm.org/devmtg/2012-11#talk13">Building a Checker in 24 hours</a> talk.</li>
+</ul>
+
+<p>The release specifically includes notable improvements for Objective-C analysis, including:</p>
+
+<ul>
+    <li>Interprocedural analysis for Objective-C methods.</li>
+    <li>Interprocedural analysis of calls to &quot;blocks&quot;.</li>
+    <li>Precise modeling of GCD APIs such as <tt>dispatch_once</tt> and friends.</li>
+    <li>Improved support for recently added Objective-C constructs such as array and dictionary literals.</li>
+</ul>
+
+<p>The release specifically includes notable improvements for C++ analysis, including:</p>
+
+<ul>
+    <li>Interprocedural analysis for C++ methods (within a translation unit).</li>
+    <li>More precise modeling of C++ initializers and destructors.</li>
+</ul>
+
+<p>Finally, this release includes many small improvements to <tt>scan-build</tt>, which can be used to drive the analyzer from the command line or a continuous integration system.  This includes a fix for a directory-traversal issue, which could cause potential security problems in some cases.  We would like to acknowledge Tim Brown of Portcullis Computer Security Ltd for reporting this issue.</p>
+    
+</div>
+
 </div>

 <!-- *********************************************************************** -->
@ -265,6 +323,19 @@ Release Notes</a>.</h1>

 </div>

+<h3>EmbToolkit</h3>
+
+<div>
+
+<p><a href="http://www.embtoolkit.org/">EmbToolkit</a> provides Linux cross-compiler 
+    toolchain/SDK (GCC/binutils/C library (uclibc,eglibc,musl)), a build system for 
+    package cross-compilation and optionally various root file systems. 
+    It supports ARM and MIPS. There is an ongoing effort to provide a clang+llvm 
+    environment for the 3.2 releases.
+</p>
+
+</div>
+
 <h3>FAUST</h3>

 <div>
@ -274,7 +345,7 @@ Release Notes</a>.</h1>
   AUdio STream. Its programming model combines two approaches: functional
   programming and block diagram composition. In addition with the C, C++, Java,
   JavaScript output formats, the Faust compiler can generate LLVM bitcode, and
-   works with LLVM 2.7-3.1.</p>
+   works with LLVM 2.7-3.2.</p>

 </div>

@ -331,7 +402,11 @@ Release Notes</a>.</h1>

 <p>OSL was developed by Sony Pictures Imageworks for use in its in-house
   renderer used for feature film animation and visual effects, and is
-   distributed as open source software with the "New BSD" license.</p>
+   distributed as open source software with the "New BSD" license.
+   It has been used for all the shading on such films as The Amazing Spider-Man,
+   Men in Black III, Hotel Transylvania, and many other films in progress, 
+   and also has been incorporated into several commercial and open source 
+   rendering products such as Blender, VRay, and Autodesk Beast.</p>

 </div>

@ -367,7 +442,7 @@ Release Notes</a>.</h1>
   C++, Fortran and Faust code in Pure programs if the corresponding
   LLVM-enabled compilers are installed).</p>

-<p>Pure version 0.54 has been tested and is known to work with LLVM 3.1 (and
+<p>Pure version 0.56 has been tested and is known to work with LLVM 3.2 (and
   continues to work with older LLVM releases >= 2.5).</p>

 </div>
@ -432,7 +507,9 @@ Release Notes</a>.</h1>
 <p>LLVM 3.2 includes several major changes and big features:</p>

 <ul>
-  <li>...</li>
+  <li>Loop Vectorizer.</li>
+  <li>New implementation of SROA.</li>
+  <li>New NVPTX back-end (replacing existing PTX back-end) based on NVIDIA sources.</li>
 </ul>

 </div>
@ -451,7 +528,10 @@ Release Notes</a>.</h1>
 <ul>
  <li>Thread local variables may have a specified TLS model. See the
  <a href="LangRef.html#globalvars">Language Reference Manual</a>.</li>
-  <li>...</li>
+  <li>'TYPE_CODE_FUNCTION_OLD' type code and autoupgrade code for old function attributes format has been removed.</li>
+  <li>Internal representation of the Attributes class has been converted into a pointer to an
+         opaque object that's uniqued by and stored in the LLVMContext object. 
+         The Attributes class then becomes a thin wrapper around this opaque object.</li>
 </ul>

 </div>
@ -489,23 +569,33 @@ Release Notes</a>.</h1>
    <ul>
    <li>The inner most loops must have a single basic block.</li>
    <li>The number of iterations are known before the loop starts to execute.</li>
-    <li>The loop counter needs to be incrimented by one.</li>
+    <li>The loop counter needs to be incremented by one.</li>
    <li>The loop trip count <b>can</b> be a variable.</li>
    <li>Loops do <b>not</b> need to start at zero.</li>
    <li>The induction variable can be used inside the loop.</li>
    <li>Loop reductions are supported.</li>
    <li>Arrays with affine access pattern do <b>not</b> need to be marked as 'noalias' and are checked at runtime.</li>
-    <li>...</li>
    </ul>

 </p>

-<p>SROA - We've re-written SROA to be significantly more powerful.
-<!-- FIXME: Add more text here... --></p>
+<p>SROA - We&#8217;ve re-written SROA to be significantly more powerful and generate
+code which is much more friendly to the rest of the optimization pipeline.
+Previously this pass had scaling problems that required it to only operate on
+relatively small aggregates, and at times it would mistakenly replace a large
+aggregate with a single very large integer in order to make it a scalar SSA
+value. The result was a large number of i1024 and i2048 values representing any
+small stack buffer. These in turn slowed down many subsequent optimization
+paths.</p>
+<p>The new SROA pass uses a different algorithm that allows it to only promote to
+scalars the pieces of the aggregate actively in use. Because of this it doesn&#8217;t
+require any thresholds. It also always deduces the scalar values from the uses
+of the aggregate rather than the specific LLVM type of the aggregate. These
+features combine to both optimize more code with the pass and improve the
+compile time of many functions dramatically.</p>

 <ul>
-  <li>Branch weight metadata is preseved through more of the optimizer.</li>
-  <li>...</li>
+  <li>Branch weight metadata is preserved through more of the optimizer.</li>
 </ul>

 </div>
@ -524,8 +614,19 @@ Release Notes</a>.</h1>
   <a href="http://blog.llvm.org/2010/04/intro-to-llvm-mc-project.html">Intro
   to the LLVM MC Project Blog Post</a>.</p>

-<ul>
-  <li>...</li>
+<ul>    
+  <li>Added support for the following assembler directives: <code>.ifb</code>, <code>.ifnb</code>, <code>.ifc</code>,
+          <code>.ifnc</code>, <code>.purgem</code>, <code>.rept</code> and <code>.version</code> (ELF), as well as the Darwin-specific
+	<code>.pushsection</code>, <code>.popsection</code> and <code>.previous</code>.</li>
+  <li>Enhanced handling of the <code>.lcomm</code> directive.</li>
+  <li>MS style inline assembler: added implementation of the offset and TYPE operators.</li>
+  <li>Targets can specify minimum supported NOP size for NOP padding.</li>
+  <li>ELF improvements: added support for generating ELF objects on Windows.</li>
+  <li>MachO improvements:  symbol-difference variables are marked as N_ABS, added direct-to-object attribute for data-in-code markers.</li>
+  <li>Added support for annotated disassembly output for x86 and arm targets.</li>
+  <li>ARM support has been improved by adding support for the ARM TARGET2 relocation
+          and fixing handling of ARM-style "$d.*" labels.</li>
+   <li>Implemented local-exec TLS on PowerPC.</li>
 </ul>

 </div>
@ -550,10 +651,6 @@ Release Notes</a>.</h1>
   infrastructure, which allows us to implement more aggressive algorithms and
   make it run faster:</p>

-<ul>
-  <li>...</li>
-</ul>
-
 <p> We added new TableGen infrastructure to support bundling for
    Very Long Instruction Word (VLIW) architectures. TableGen can now
    automatically generate a deterministic finite automaton from a VLIW
@ -563,6 +660,13 @@ Release Notes</a>.</h1>
 <p> We have added a new target independent VLIW packetizer based on the
    DFA infrastructure to group machine instructions into bundles.</p>

+<p> We have added new TableGen infrastructure to support relationship maps
+    between instructions. This feature enables TableGen to automatically
+    construct a set of relation tables and query functions that can be used
+    to switch between various forms of instructions. For more information,
+    please refer to <a href="http://llvm.org/docs/HowToUseInstrMappings.html">
+    How To Use Instruction Mappings</a>.</p> 
+
 </div>

 <h4>
@ -588,7 +692,7 @@ Release Notes</a>.</h1>
 <p>New features and major changes in the X86 target include:</p>

 <ul>
-  <li>...</li>
+  <li>Small codegen optimizations, especially for AVX2.</li>
 </ul>

 </div>
@ -603,7 +707,7 @@ Release Notes</a>.</h1>
 <p>New features of the ARM target include:</p>

 <ul>
-  <li>...</li>
+  <li>Support and performance tuning for the A6 'Swift' CPU.</li>
 </ul>

 <!--_________________________________________________________________________-->
@ -620,7 +724,7 @@ Release Notes</a>.</h1>
   platform specific support for Linux.</p>

 <p>Full support is included for Thumb1, Thumb2 and ARM modes, along with
-   subtarget and CPU specific extensions for VFP2, VFP3 and NEON.</p>
+   sub-target and CPU specific extensions for VFP2, VFP3 and NEON.</p>

 <p>The assembler is Unified Syntax only (see ARM Architectural Reference Manual
   for details). While there is some, and growing, support for pre-unified
@ -640,7 +744,29 @@ Release Notes</a>.</h1>
 <p>New features and major changes in the MIPS target include:</p>

 <ul>
-  <li>...</li>
+  <li>Integrated assembler support: 
+         MIPS32 works for both PIC and static; a known limitation is PR14456, where the
+         R_MIPS_GPREL16 relocation is generated with the wrong addend.
+         MIPS64 support is incomplete; for example, exception handling is not working.</li>
+   <li>Support for fast calling convention has been added.</li>
+   <li>Support for Android MIPS toolchain has been added to clang driver.</li>
+   <li>Added clang driver support for MIPS N32 ABI through "-mabi=n32" option.</li>
+   <li>MIPS32 and MIPS64 disassembler has been implemented.</li>
+   <li>Support for compiling programs with large GOTs (exceeding 64kB in size) has been added 
+	through llc option "-mxgot".</li>
+  <li>Added experimental support for MIPS32 DSP intrinsics.</li>
+  <li>Experimental support for MIPS16 with the following limitations: only soft float is supported,
+         C++ exceptions are not supported, large stack frames (> 32000 bytes) are not supported,
+         and direct object code emission is not supported (only .s output).</li>
+  <li>Standalone assembler (llvm-mc):  implementation is in progress and considered experimental.</li>
+  <li>All classic JIT and MCJIT tests pass on Little and Big Endian MIPS32 platforms.</li>
+  <li>Inline asm support: all common constraints and operand modifiers have been implemented.</li>
+  <li>Added tail call optimization support; use llc option "-enable-mips-tail-calls"
+      or clang options "-mllvm -enable-mips-tail-calls" to enable it.</li>
+  <li>Improved register allocation by removing registers $fp, $gp, $ra and $at from the list of reserved registers.</li>
+  <li>Long branch expansion pass has been implemented, which expands branch
+      instructions with offsets that do not fit in the 16-bit field.</li>
+  <li>Cavium Octeon II board is used for testing builds (llvm-mips-linux builder).</li>
 </ul>

 </div>
@ -652,7 +778,6 @@ Release Notes</a>.</h1>

 <div>

-<ul>
 <p>Many fixes and changes across LLVM (and Clang) for better compliance with
   the 64-bit PowerPC ELF Application Binary Interface, interoperability with
   GCC, and overall 64-bit PowerPC support.   Some highlights include:</p>
@ -681,8 +806,28 @@ Release Notes</a>.</h1>
 <p>There have also been code generation improvements for both 32- and 64-bit
   code. Instruction scheduling support for the Freescale e500mc and e5500
   cores has been added.</p>
+
+</div>
+
+<!--=========================================================================-->
+<h3>
+<a name="NVPTX">PTX/NVPTX Target Improvements</a>
+</h3>
+
+<div>
+
+<p>The PTX back-end has been replaced by the NVPTX back-end, which is based on
+   the LLVM back-end used by NVIDIA in their CUDA (nvcc) and OpenCL compiler.
+   Some highlights include:</p>
+<ul>
+  <li>Compatibility with PTX 3.1 and SM 3.5</li>
+  <li>Support for NVVM intrinsics as defined in the NVIDIA Compiler SDK</li>
+  <li>Full compatibility with old PTX back-end, with much greater coverage of
+      LLVM IR</li>
 </ul>

+<p>Please submit any back-end bugs to the LLVM Bugzilla site.</p>
+
 </div>

 <!--=========================================================================-->
@ -693,7 +838,7 @@ Release Notes</a>.</h1>
 <div>

 <ul>
-  <li>...</li>
+  <li>Added support for custom names for library functions in TargetLibraryInfo.</li>
 </ul>

 </div>
@ -710,9 +855,11 @@ Release Notes</a>.</h1>
   from the previous release.</p>

 <ul>
-  <li>...</li>
-</ul>
-
+<li>llvm-ld and llvm-stub have been removed, llvm-ld functionality can be partially replaced by 
+        llvm-link | opt | {llc | as, llc -filetype=obj} | ld, or fully replaced by Clang. </li>
+<li>MCJIT: added support for inline assembly (requires asm parser), added faux remote target execution to lli option '-remote-mcjit'.</li>
+</ul> 
+ 
 </div>

 <!--=========================================================================-->
@ -733,10 +880,6 @@ Release Notes</a>.</h1>
 <p> The TargetData structure has been renamed to DataLayout and moved to VMCore
 to remove a dependency on Target. </p>

-<ul>
-  <li>...</li>
-</ul>
-
 </div>

 <!--=========================================================================-->
@ -746,34 +889,23 @@ to remove a dependency on Target. </p>

 <div>

-<p>In addition, some tools have changed in this release. Some of the changes
-   are:</p>
+<p>In addition, some tools have changed in this release. Some of the changes are:</p>

 <ul>
-  <li>...</li>
+<li>opt: added support for '-mtriple' option.</li>
+<li>llvm-mc: added '-disassemble' support for '-show-inst' and '-show-encoding' options, added '-edis' option to produce annotated 
+        disassembly output for X86 and ARM targets.</li>
+<li>libprofile: allows the profile data file name to be specified by the LLVMPROF_OUTPUT environment variable.</li>
+<li>llvm-objdump: has been changed to display available targets, '-arch' option accepts x86 and x86-64 as valid arch names.</li>
+<li>llc and opt: added FMA formation from pairs of FADD + FMUL or FSUB + FMUL enabled by option '-enable-excess-fp-precision' or option '-enable-unsafe-fp-math',
+       option '-fp-contract' controls the creation by optimizations of fused FP by selecting Fast, Standard, or Strict mode.</li>
+<li>llc: object file output from llc is no longer considered experimental.</li>
+<li>gold plugin: handles Position Independent Executables.</li>
 </ul>

 </div>


-<!--=========================================================================-->
-<h3>
-<a name="python">Python Bindings</a>
-</h3>
-
-<div>
-
-<p>Officially supported Python bindings have been added! Feature support is far
-   from complete. The current bindings support interfaces to:</p>
-
-<ul>
-  <li>...</li>
-</ul>
-
-</div>
-
-</div>
-
 <!-- *********************************************************************** -->
 <h2>
  <a name="knownproblems">Known Problems</a>
@ -794,7 +926,7 @@ to remove a dependency on Target. </p>
  <p>Known problem areas include:</p>

 <ul>
-  <li>The CellSPU, MSP430, PTX and XCore backends are experimental.</li>
+  <li>The CellSPU, MSP430, and XCore backends are experimental, and the CellSPU backend will be removed in LLVM 3.3.</li>

  <li>The integrated assembler, disassembler, and JIT is not supported by
      several targets. If an integrated assembler is not supported, then a
@ -836,7 +968,7 @@ to remove a dependency on Target. </p>
  src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>

  <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br>
-  Last modified: $Date: 2012-11-20 05:22:44 +0100 (Tue, 20 Nov 2012) $
+  Last modified: $Date: 2012-12-19 11:50:28 +0100 (Wed, 19 Dec 2012) $
 </address>

 </body>
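The Loop Vectorizer conditions listed in the release notes above (single-block innermost loop, counter stepped by one, variable trip count, non-zero start, reductions, affine accesses without 'noalias') can be illustrated with two small loops. This is only a sketch: the function names `sumRange` and `addScaled` are hypothetical and do not come from the LLVM sources, and whether a given compiler build actually vectorizes them depends on its options and cost model.

```cpp
// Hypothetical examples of loops that fit the listed conditions.

// Reduction over a[begin..end): single-block body, counter incremented by
// one, trip count only known at run time, start index need not be zero.
int sumRange(const int *a, int begin, int end) {
  int sum = 0;
  for (int i = begin; i < end; ++i)
    sum += a[i];
  return sum;
}

// Affine accesses through pointers that are not marked noalias; per the
// notes, possible overlap is checked at run time. The induction variable
// i is also used as a value inside the loop body.
void addScaled(int *out, const int *in, int n, int k) {
  for (int i = 0; i < n; ++i)
    out[i] += k * in[i] + i;
}
```

Both loops have a body that is a single basic block, which is the first condition in the list; adding a conditional inside the body would break it.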
--- a/include/llvm/MC/MCExpr.h
+++ b/include/llvm/MC/MCExpr.h
@ -197,7 +197,11 @@ public:
    VK_Mips_GOT_PAGE,
    VK_Mips_GOT_OFST,
    VK_Mips_HIGHER,
-    VK_Mips_HIGHEST
+    VK_Mips_HIGHEST,
+    VK_Mips_GOT_HI16,
+    VK_Mips_GOT_LO16,
+    VK_Mips_CALL_HI16,
+    VK_Mips_CALL_LO16
  };

 private:
--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
+++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp
@ -346,7 +346,7 @@ uint8_t *RuntimeDyldImpl::createStubFunction(uint8_t *Addr) {
    uint32_t *StubAddr = (uint32_t*)Addr;
    *StubAddr = 0xe51ff004; // ldr pc,<label>
    return (uint8_t*)++StubAddr;
-  } else if (Arch == Triple::mipsel) {
+  } else if (Arch == Triple::mipsel || Arch == Triple::mips) {
    uint32_t *StubAddr = (uint32_t*)Addr;
    // 0:   3c190000        lui     t9,%hi(addr).
    // 4:   27390000        addiu   t9,t9,%lo(addr).
--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
@ -676,7 +676,8 @@ void RuntimeDyldELF::processRelocationRef(const ObjRelocationInfo &Rel,
                        RelType, 0);
      Section.StubOffset += getMaxStubSize();
    }
-  } else if (Arch == Triple::mipsel && RelType == ELF::R_MIPS_26) {
+  } else if ((Arch == Triple::mipsel || Arch == Triple::mips) &&
+             RelType == ELF::R_MIPS_26) {
    // This is an Mips branch relocation, need to use a stub function.
    DEBUG(dbgs() << "\t\tThis is a Mips branch relocation.");
    SectionEntry &Section = Sections[Rel.SectionID];
--- a/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
+++ b/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldImpl.h
@ -168,7 +168,7 @@ protected:
  inline unsigned getMaxStubSize() {
    if (Arch == Triple::arm || Arch == Triple::thumb)
      return 8; // 32-bit instruction and 32-bit address
-    else if (Arch == Triple::mipsel)
+    else if (Arch == Triple::mipsel || Arch == Triple::mips)
      return 16;
    else if (Arch == Triple::ppc64)
      return 44;
--- a/lib/MC/MCExpr.cpp
+++ b/lib/MC/MCExpr.cpp
@ -229,6 +229,10 @@ StringRef MCSymbolRefExpr::getVariantKindName(VariantKind Kind) {
  case VK_Mips_GOT_OFST: return "GOT_OFST";
  case VK_Mips_HIGHER:   return "HIGHER";
  case VK_Mips_HIGHEST:  return "HIGHEST";
+  case VK_Mips_GOT_HI16: return "GOT_HI16";
+  case VK_Mips_GOT_LO16: return "GOT_LO16";
+  case VK_Mips_CALL_HI16: return "CALL_HI16";
+  case VK_Mips_CALL_LO16: return "CALL_LO16";
  }
  llvm_unreachable("Invalid variant kind");
 }
--- a/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp
+++ b/lib/Target/Mips/InstPrinter/MipsInstPrinter.cpp
@ -128,6 +128,10 @@ static void printExpr(const MCExpr *Expr, raw_ostream &OS) {
  case MCSymbolRefExpr::VK_Mips_GOT_OFST:  OS << "%got_ofst("; break;
  case MCSymbolRefExpr::VK_Mips_HIGHER:    OS << "%higher("; break;
  case MCSymbolRefExpr::VK_Mips_HIGHEST:   OS << "%highest("; break;
+  case MCSymbolRefExpr::VK_Mips_GOT_HI16:  OS << "%got_hi("; break;
+  case MCSymbolRefExpr::VK_Mips_GOT_LO16:  OS << "%got_lo("; break;
+  case MCSymbolRefExpr::VK_Mips_CALL_HI16: OS << "%call_hi("; break;
+  case MCSymbolRefExpr::VK_Mips_CALL_LO16: OS << "%call_lo("; break;
  }

  OS << SRE->getSymbol();
--- a/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp
+++ b/lib/Target/Mips/MCTargetDesc/MipsAsmBackend.cpp
@ -42,6 +42,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) {
  case Mips::fixup_Mips_GOT_PAGE:
  case Mips::fixup_Mips_GOT_OFST:
  case Mips::fixup_Mips_GOT_DISP:
+  case Mips::fixup_Mips_GOT_LO16:
+  case Mips::fixup_Mips_CALL_LO16:
    break;
  case Mips::fixup_Mips_PC16:
    // So far we are only using this type for branches.
@ -60,6 +62,8 @@ static unsigned adjustFixupValue(unsigned Kind, uint64_t Value) {
    break;
  case Mips::fixup_Mips_HI16:
  case Mips::fixup_Mips_GOT_Local:
+  case Mips::fixup_Mips_GOT_HI16:
+  case Mips::fixup_Mips_CALL_HI16:
    // Get the 2nd 16-bits. Also add 1 if bit 15 is 1.
    Value = ((Value + 0x8000) >> 16) & 0xffff;
    break;
@ -179,7 +183,11 @@ public:
      { "fixup_Mips_GOT_OFST",     0,     16,   0 },
      { "fixup_Mips_GOT_DISP",     0,     16,   0 },
      { "fixup_Mips_HIGHER",       0,     16,   0 },
-      { "fixup_Mips_HIGHEST",      0,     16,   0 }
+      { "fixup_Mips_HIGHEST",      0,     16,   0 },
+      { "fixup_Mips_GOT_HI16",     0,     16,   0 },
+      { "fixup_Mips_GOT_LO16",     0,     16,   0 },
+      { "fixup_Mips_CALL_HI16",    0,     16,   0 },
+      { "fixup_Mips_CALL_LO16",    0,     16,   0 }
    };

    if (Kind < FirstTargetFixupKind)
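The "add 1 if bit 15 is 1" adjustment applied to the HI16-style fixups above can be checked with a small worked example. The sketch below assumes 32-bit values and uses hypothetical helper names (`hi16`, `lo16`, `recombine`); it is not the code in MipsAsmBackend.cpp, only the same arithmetic. The point is that the low half is later added as a sign-extended 16-bit immediate, so the high half has to be rounded up whenever the low half is negative.

```cpp
#include <cstdint>

// Upper half as used by the HI16-style fixups: adding 0x8000 first bumps
// the result by one when bit 15 of the value is set.
inline uint32_t hi16(uint32_t Value) {
  return ((Value + 0x8000) >> 16) & 0xffff;
}

// Lower 16 bits, as they would sit in an instruction's immediate field.
inline uint32_t lo16(uint32_t Value) { return Value & 0xffff; }

// What a lui/addiu pair computes: (Hi << 16) plus the sign-extended low half.
inline uint32_t recombine(uint32_t Hi, uint32_t Lo) {
  int32_t SignedLo = (Lo & 0x8000) ? static_cast<int32_t>(Lo) - 0x10000
                                   : static_cast<int32_t>(Lo);
  return (Hi << 16) + static_cast<uint32_t>(SignedLo);
}
```

For the value 0x12348000 the helpers give a high half of 0x1235 and a low half of 0x8000; the low half sign-extends to -0x8000, so the sum is again 0x12348000.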
--- a/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h
+++ b/lib/Target/Mips/MCTargetDesc/MipsBaseInfo.h
@ -84,7 +84,13 @@ namespace MipsII {
    /// MO_HIGHER/HIGHEST - Represents the highest or higher half word of a
    /// 64-bit symbol address.
    MO_HIGHER,
-    MO_HIGHEST
+    MO_HIGHEST,
+
+    /// MO_GOT_HI16/LO16, MO_CALL_HI16/LO16 - Relocations used for large GOTs.
+    MO_GOT_HI16,
+    MO_GOT_LO16,
+    MO_CALL_HI16,
+    MO_CALL_LO16
  };

  enum {
--- a/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp
+++ b/lib/Target/Mips/MCTargetDesc/MipsELFObjectWriter.cpp
@ -179,6 +179,18 @@ unsigned MipsELFObjectWriter::GetRelocType(const MCValue &Target,
  case Mips::fixup_Mips_HIGHEST:
    Type = ELF::R_MIPS_HIGHEST;
    break;
+  case Mips::fixup_Mips_GOT_HI16:
+    Type = ELF::R_MIPS_GOT_HI16;
+    break;
+  case Mips::fixup_Mips_GOT_LO16:
+    Type = ELF::R_MIPS_GOT_LO16;
+    break;
+  case Mips::fixup_Mips_CALL_HI16:
+    Type = ELF::R_MIPS_CALL_HI16;
+    break;
+  case Mips::fixup_Mips_CALL_LO16:
+    Type = ELF::R_MIPS_CALL_LO16;
+    break;
  }
  return Type;
 }
--- a/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h
+++ b/lib/Target/Mips/MCTargetDesc/MipsFixupKinds.h
@ -116,6 +116,18 @@ namespace Mips {
    // resulting in - R_MIPS_HIGHEST
    fixup_Mips_HIGHEST,

+    // resulting in - R_MIPS_GOT_HI16
+    fixup_Mips_GOT_HI16,
+
+    // resulting in - R_MIPS_GOT_LO16
+    fixup_Mips_GOT_LO16,
+
+    // resulting in - R_MIPS_CALL_HI16
+    fixup_Mips_CALL_HI16,
+
+    // resulting in - R_MIPS_CALL_LO16
+    fixup_Mips_CALL_LO16,
+
    // Marker
    LastTargetFixupKind,
    NumTargetFixupKinds = LastTargetFixupKind - FirstTargetFixupKind
--- a/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp
+++ b/lib/Target/Mips/MCTargetDesc/MipsMCCodeEmitter.cpp
@ -287,6 +287,18 @@ getMachineOpValue(const MCInst &MI, const MCOperand &MO,
  case MCSymbolRefExpr::VK_Mips_HIGHEST:
    FixupKind = Mips::fixup_Mips_HIGHEST;
    break;
+  case MCSymbolRefExpr::VK_Mips_GOT_HI16:
+    FixupKind = Mips::fixup_Mips_GOT_HI16;
+    break;
+  case MCSymbolRefExpr::VK_Mips_GOT_LO16:
+    FixupKind = Mips::fixup_Mips_GOT_LO16;
+    break;
+  case MCSymbolRefExpr::VK_Mips_CALL_HI16:
+    FixupKind = Mips::fixup_Mips_CALL_HI16;
+    break;
+  case MCSymbolRefExpr::VK_Mips_CALL_LO16:
+    FixupKind = Mips::fixup_Mips_CALL_LO16;
+    break;
  } // switch

  Fixups.push_back(MCFixup::Create(0, MO.getExpr(), MCFixupKind(FixupKind)));
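The four large-GOT kinds added in this commit pass through several switch statements in different files: the assembly operator printed by MipsInstPrinter, the fixup kind chosen in MipsMCCodeEmitter, and the ELF relocation selected in MipsELFObjectWriter. The sketch below collects that correspondence in one place. The enum `LargeGotKind`, the struct `LargeGotInfo` and the function `describe` are hypothetical names used only for illustration; the strings are the ones that appear in the hunks above.

```cpp
#include <string>

// Hypothetical stand-in for the four new MCSymbolRefExpr variant kinds.
enum class LargeGotKind { GotHi16, GotLo16, CallHi16, CallLo16 };

struct LargeGotInfo {
  std::string AsmOperator; // operator text printed in assembly output
  std::string FixupName;   // fixup kind name from the backend table
  std::string RelocName;   // ELF relocation chosen by the object writer
};

// One row per kind, mirroring the case labels added in the diff.
inline LargeGotInfo describe(LargeGotKind Kind) {
  switch (Kind) {
  case LargeGotKind::GotHi16:
    return {"%got_hi", "fixup_Mips_GOT_HI16", "R_MIPS_GOT_HI16"};
  case LargeGotKind::GotLo16:
    return {"%got_lo", "fixup_Mips_GOT_LO16", "R_MIPS_GOT_LO16"};
  case LargeGotKind::CallHi16:
    return {"%call_hi", "fixup_Mips_CALL_HI16", "R_MIPS_CALL_HI16"};
  case LargeGotKind::CallLo16:
    return {"%call_lo", "fixup_Mips_CALL_LO16", "R_MIPS_CALL_LO16"};
  }
  return {"", "", ""}; // not reached for valid enumerators
}
```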
--- a/lib/Target/Mips/Mips64InstrInfo.td
+++ b/lib/Target/Mips/Mips64InstrInfo.td
@ -255,6 +255,7 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi64 tblockaddress:$in)>;
 def : MipsPat<(MipsHi tjumptable:$in), (LUi64 tjumptable:$in)>;
 def : MipsPat<(MipsHi tconstpool:$in), (LUi64 tconstpool:$in)>;
 def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi64 tglobaltlsaddr:$in)>;
+def : MipsPat<(MipsHi texternalsym:$in), (LUi64 texternalsym:$in)>;

 def : MipsPat<(MipsLo tglobaladdr:$in), (DADDiu ZERO_64, tglobaladdr:$in)>;
 def : MipsPat<(MipsLo tblockaddress:$in), (DADDiu ZERO_64, tblockaddress:$in)>;
@ -262,6 +263,7 @@ def : MipsPat<(MipsLo tjumptable:$in), (DADDiu ZERO_64, tjumptable:$in)>;
 def : MipsPat<(MipsLo tconstpool:$in), (DADDiu ZERO_64, tconstpool:$in)>;
 def : MipsPat<(MipsLo tglobaltlsaddr:$in),
              (DADDiu ZERO_64, tglobaltlsaddr:$in)>;
+def : MipsPat<(MipsLo texternalsym:$in), (DADDiu ZERO_64, texternalsym:$in)>;

 def : MipsPat<(add CPU64Regs:$hi, (MipsLo tglobaladdr:$lo)),
              (DADDiu CPU64Regs:$hi, tglobaladdr:$lo)>;
--- a/lib/Target/Mips/MipsCodeEmitter.cpp
+++ b/lib/Target/Mips/MipsCodeEmitter.cpp
@ -85,7 +85,7 @@ class MipsCodeEmitter : public MachineFunctionPass {

  private:

-    void emitWordLE(unsigned Word);
+    void emitWord(unsigned Word);

    /// Routines that handle operands which add machine relocations which are
    /// fixed up by the relocation stage.
@ -112,12 +112,6 @@ class MipsCodeEmitter : public MachineFunctionPass {
    unsigned getSizeExtEncoding(const MachineInstr &MI, unsigned OpNo) const;
    unsigned getSizeInsEncoding(const MachineInstr &MI, unsigned OpNo) const;

-    int emitULW(const MachineInstr &MI);
-    int emitUSW(const MachineInstr &MI);
-    int emitULH(const MachineInstr &MI);
-    int emitULHu(const MachineInstr &MI);
-    int emitUSH(const MachineInstr &MI);
-
    void emitGlobalAddressUnaligned(const GlobalValue *GV, unsigned Reloc,
                                    int Offset) const;
  };
@ -133,7 +127,7 @@ bool MipsCodeEmitter::runOnMachineFunction(MachineFunction &MF) {
  MCPEs = &MF.getConstantPool()->getConstants();
  MJTEs = 0;
  if (MF.getJumpTableInfo()) MJTEs = &MF.getJumpTableInfo()->getJumpTables();
-  JTI->Initialize(MF, IsPIC);
+  JTI->Initialize(MF, IsPIC, Subtarget->isLittle());
  MCE.setModuleInfo(&getAnalysis<MachineModuleInfo> ());

  do {
@ -271,103 +265,6 @@ void MipsCodeEmitter::emitMachineBasicBlock(MachineBasicBlock *BB,
                                             Reloc, BB));
 }

-int MipsCodeEmitter::emitUSW(const MachineInstr &MI) {
-  unsigned src = getMachineOpValue(MI, MI.getOperand(0));
-  unsigned base = getMachineOpValue(MI, MI.getOperand(1));
-  unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
-  // swr src, offset(base)
-  // swl src, offset+3(base)
-  MCE.emitWordLE(
-    (0x2e << 26) | (base << 21) | (src << 16) | (offset & 0xffff));
-  MCE.emitWordLE(
-    (0x2a << 26) | (base << 21) | (src << 16) | ((offset+3) & 0xffff));
-  return 2;
-}
-
-int MipsCodeEmitter::emitULW(const MachineInstr &MI) {
-  unsigned dst = getMachineOpValue(MI, MI.getOperand(0));
-  unsigned base = getMachineOpValue(MI, MI.getOperand(1));
-  unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
-  unsigned at = 1;
-  if (dst != base) {
-    // lwr dst, offset(base)
-    // lwl dst, offset+3(base)
-    MCE.emitWordLE(
-      (0x26 << 26) | (base << 21) | (dst << 16) | (offset & 0xffff));
-    MCE.emitWordLE(
-      (0x22 << 26) | (base << 21) | (dst << 16) | ((offset+3) & 0xffff));
-    return 2;
-  } else {
-    // lwr at, offset(base)
-    // lwl at, offset+3(base)
-    // addu dst, at, $zero
-    MCE.emitWordLE(
-      (0x26 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));
-    MCE.emitWordLE(
-      (0x22 << 26) | (base << 21) | (at << 16) | ((offset+3) & 0xffff));
-    MCE.emitWordLE(
-      (0x0 << 26) | (at << 21) | (0x0 << 16) | (dst << 11) | (0x0 << 6) | 0x21);
-    return 3;
-  }
-}
-
-int MipsCodeEmitter::emitUSH(const MachineInstr &MI) {
-  unsigned src = getMachineOpValue(MI, MI.getOperand(0));
-  unsigned base = getMachineOpValue(MI, MI.getOperand(1));
-  unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
-  unsigned at = 1;
-  // sb src, offset(base)
-  // srl at,src,8
-  // sb at, offset+1(base)
-  MCE.emitWordLE(
-    (0x28 << 26) | (base << 21) | (src << 16) | (offset & 0xffff));
-  MCE.emitWordLE(
-    (0x0 << 26) | (0x0 << 21) | (src << 16) | (at << 11) | (0x8 << 6) | 0x2);
-  MCE.emitWordLE(
-    (0x28 << 26) | (base << 21) | (at << 16) | ((offset+1) & 0xffff));
-  return 3;
-}
-
-int MipsCodeEmitter::emitULH(const MachineInstr &MI) {
-  unsigned dst = getMachineOpValue(MI, MI.getOperand(0));
-  unsigned base = getMachineOpValue(MI, MI.getOperand(1));
-  unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
-  unsigned at = 1;
-  // lbu at, offset(base)
-  // lb dst, offset+1(base)
-  // sll dst,dst,8
-  // or dst,dst,at
-  MCE.emitWordLE(
-    (0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));
-  MCE.emitWordLE(
-    (0x20 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff));
-  MCE.emitWordLE(
-    (0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0);
-  MCE.emitWordLE(
-    (0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25);
-  return 4;
-}
-
-int MipsCodeEmitter::emitULHu(const MachineInstr &MI) {
-  unsigned dst = getMachineOpValue(MI, MI.getOperand(0));
-  unsigned base = getMachineOpValue(MI, MI.getOperand(1));
-  unsigned offset = getMachineOpValue(MI, MI.getOperand(2));
-  unsigned at = 1;
-  // lbu at, offset(base)
-  // lbu dst, offset+1(base)
-  // sll dst,dst,8
-  // or dst,dst,at
-  MCE.emitWordLE(
-    (0x24 << 26) | (base << 21) | (at << 16) | (offset & 0xffff));
-  MCE.emitWordLE(
-    (0x24 << 26) | (base << 21) | (dst << 16) | ((offset+1) & 0xffff));
-  MCE.emitWordLE(
-    (0x0 << 26) | (0x0 << 21) | (dst << 16) | (dst << 11) | (0x8 << 6) | 0x0);
-  MCE.emitWordLE(
-    (0x0 << 26) | (dst << 21) | (at << 16) | (dst << 11) | (0x0 << 6) | 0x25);
-  return 4;
-}
-
 void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) {
  DEBUG(errs() << "JIT: " << (void*)MCE.getCurrentPCValue() << ":\t" << MI);

@ -377,16 +274,19 @@ void MipsCodeEmitter::emitInstruction(const MachineInstr &MI) {
  if ((MI.getDesc().TSFlags & MipsII::FormMask) == MipsII::Pseudo)
    return;

-  emitWordLE(getBinaryCodeForInstr(MI));
+  emitWord(getBinaryCodeForInstr(MI));
  ++NumEmitted;  // Keep track of the # of mi's emitted

  MCE.processDebugLoc(MI.getDebugLoc(), false);
 }

-void MipsCodeEmitter::emitWordLE(unsigned Word) {
+void MipsCodeEmitter::emitWord(unsigned Word) {
  DEBUG(errs() << "  0x";
        errs().write_hex(Word) << "\n");
-  MCE.emitWordLE(Word);
+  if (Subtarget->isLittle())
+    MCE.emitWordLE(Word);
+  else
+    MCE.emitWordBE(Word);
 }

 /// createMipsJITCodeEmitterPass - Return a pass that emits the collected Mips
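
Note on the hunk above: `emitWord` now picks the byte order from the subtarget instead of always emitting little-endian words. The following is a minimal sketch of that dispatch, not the LLVM implementation. `ByteBuffer` and `encodeWord` are hypothetical stand-ins for the JIT code emitter, and the `IsLittle` flag plays the role of `Subtarget->isLittle()`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the JIT code buffer; this is not an LLVM class.
struct ByteBuffer {
  std::vector<uint8_t> Bytes;

  // Little-endian: least significant byte is written first.
  void emitWordLE(uint32_t Word) {
    for (int Shift = 0; Shift <= 24; Shift += 8)
      Bytes.push_back(static_cast<uint8_t>(Word >> Shift));
  }

  // Big-endian: most significant byte is written first.
  void emitWordBE(uint32_t Word) {
    for (int Shift = 24; Shift >= 0; Shift -= 8)
      Bytes.push_back(static_cast<uint8_t>(Word >> Shift));
  }

  // Mirrors the shape of the patched emitWord: choose the order at run time.
  void emitWord(uint32_t Word, bool IsLittle) {
    if (IsLittle)
      emitWordLE(Word);
    else
      emitWordBE(Word);
  }
};

// Convenience wrapper (hypothetical) returning the bytes for a single word.
std::vector<uint8_t> encodeWord(uint32_t Word, bool IsLittle) {
  ByteBuffer Buffer;
  Buffer.emitWord(Word, IsLittle);
  return Buffer.Bytes;
}
```

For the first stub word with `Hi` equal to zero, `0xf << 26 | 25 << 16` is `0x3c190000`, so `encodeWord(0x3c190000, true)` yields the bytes `00 00 19 3c` and `encodeWord(0x3c190000, false)` yields `3c 19 00 00`.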
--- a/lib/Target/Mips/MipsISelLowering.cpp
+++ b/lib/Target/Mips/MipsISelLowering.cpp
@ -46,6 +46,10 @@ static cl::opt<bool>
 EnableMipsTailCalls("enable-mips-tail-calls", cl::Hidden,
                    cl::desc("MIPS: Enable tail calls."), cl::init(false));

+static cl::opt<bool>
+LargeGOT("mxgot", cl::Hidden,
+         cl::desc("MIPS: Enable GOT larger than 64k."), cl::init(false));
+
 static const uint16_t O32IntRegs[4] = {
  Mips::A0, Mips::A1, Mips::A2, Mips::A3
 };
@ -77,6 +81,71 @@ static SDValue GetGlobalReg(SelectionDAG &DAG, EVT Ty) {
  return DAG.getRegister(FI->getGlobalBaseReg(), Ty);
 }

+static SDValue getTargetNode(SDValue Op, SelectionDAG &DAG, unsigned Flag) {
+  EVT Ty = Op.getValueType();
+
+  if (GlobalAddressSDNode *N = dyn_cast<GlobalAddressSDNode>(Op))
+    return DAG.getTargetGlobalAddress(N->getGlobal(), Op.getDebugLoc(), Ty, 0,
+                                      Flag);
+  if (ExternalSymbolSDNode *N = dyn_cast<ExternalSymbolSDNode>(Op))
+    return DAG.getTargetExternalSymbol(N->getSymbol(), Ty, Flag);
+  if (BlockAddressSDNode *N = dyn_cast<BlockAddressSDNode>(Op))
+    return DAG.getTargetBlockAddress(N->getBlockAddress(), Ty, 0, Flag);
+  if (JumpTableSDNode *N = dyn_cast<JumpTableSDNode>(Op))
+    return DAG.getTargetJumpTable(N->getIndex(), Ty, Flag);
+  if (ConstantPoolSDNode *N = dyn_cast<ConstantPoolSDNode>(Op))
+    return DAG.getTargetConstantPool(N->getConstVal(), Ty, N->getAlignment(),
+                                     N->getOffset(), Flag);
+
+  llvm_unreachable("Unexpected node type.");
+  return SDValue();
+}
+
+static SDValue getAddrNonPIC(SDValue Op, SelectionDAG &DAG) {
+  DebugLoc DL = Op.getDebugLoc();
+  EVT Ty = Op.getValueType();
+  SDValue Hi = getTargetNode(Op, DAG, MipsII::MO_ABS_HI);
+  SDValue Lo = getTargetNode(Op, DAG, MipsII::MO_ABS_LO);
+  return DAG.getNode(ISD::ADD, DL, Ty,
+                     DAG.getNode(MipsISD::Hi, DL, Ty, Hi),
+                     DAG.getNode(MipsISD::Lo, DL, Ty, Lo));
+}
+
+static SDValue getAddrLocal(SDValue Op, SelectionDAG &DAG, bool HasMips64) {
+  DebugLoc DL = Op.getDebugLoc();
+  EVT Ty = Op.getValueType();
+  unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
+  SDValue GOT = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty),
+                            getTargetNode(Op, DAG, GOTFlag));
+  SDValue Load = DAG.getLoad(Ty, DL, DAG.getEntryNode(), GOT,
+                             MachinePointerInfo::getGOT(), false, false, false,
+                             0);
+  unsigned LoFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
+  SDValue Lo = DAG.getNode(MipsISD::Lo, DL, Ty, getTargetNode(Op, DAG, LoFlag));
+  return DAG.getNode(ISD::ADD, DL, Ty, Load, Lo);
+}
+
+static SDValue getAddrGlobal(SDValue Op, SelectionDAG &DAG, unsigned Flag) {
+  DebugLoc DL = Op.getDebugLoc();
+  EVT Ty = Op.getValueType();
+  SDValue Tgt = DAG.getNode(MipsISD::Wrapper, DL, Ty, GetGlobalReg(DAG, Ty),
+                            getTargetNode(Op, DAG, Flag));
+  return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Tgt,
+                     MachinePointerInfo::getGOT(), false, false, false, 0);
+}
+
+static SDValue getAddrGlobalLargeGOT(SDValue Op, SelectionDAG &DAG,
+                                     unsigned HiFlag, unsigned LoFlag) {
+  DebugLoc DL = Op.getDebugLoc();
+  EVT Ty = Op.getValueType();
+  SDValue Hi = DAG.getNode(MipsISD::Hi, DL, Ty, getTargetNode(Op, DAG, HiFlag));
+  Hi = DAG.getNode(ISD::ADD, DL, Ty, Hi, GetGlobalReg(DAG, Ty));
+  SDValue Wrapper = DAG.getNode(MipsISD::Wrapper, DL, Ty, Hi,
+                                getTargetNode(Op, DAG, LoFlag));
+  return DAG.getLoad(Ty, DL, DAG.getEntryNode(), Wrapper,
+                     MachinePointerInfo::getGOT(), false, false, false, 0);
+}
+
 const char *MipsTargetLowering::getTargetNodeName(unsigned Opcode) const {
  switch (Opcode) {
  case MipsISD::JmpLink:           return "MipsISD::JmpLink";
@ -1743,8 +1812,6 @@ SDValue MipsTargetLowering::LowerGlobalAddress(SDValue Op,
  const GlobalValue *GV = cast<GlobalAddressSDNode>(Op)->getGlobal();

  if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {
-    SDVTList VTs = DAG.getVTList(MVT::i32);
-
    const MipsTargetObjectFile &TLOF =
      (const MipsTargetObjectFile&)getObjFileLowering();

@ -1752,69 +1819,33 @@ SDValue MipsTargetLowering::LowerGlobalAddress(SDValue Op,
    if (TLOF.IsGlobalInSmallSection(GV, getTargetMachine())) {
      SDValue GA = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,
                                              MipsII::MO_GPREL);
-      SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl, VTs, &GA, 1);
+      SDValue GPRelNode = DAG.getNode(MipsISD::GPRel, dl,
+                                      DAG.getVTList(MVT::i32), &GA, 1);
      SDValue GPReg = DAG.getRegister(Mips::GP, MVT::i32);
      return DAG.getNode(ISD::ADD, dl, MVT::i32, GPReg, GPRelNode);
    }
+
    // %hi/%lo relocation
-    SDValue GAHi = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,
-                                              MipsII::MO_ABS_HI);
-    SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, MVT::i32, 0,
-                                              MipsII::MO_ABS_LO);
-    SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, VTs, &GAHi, 1);
-    SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, GALo);
-    return DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo);
+    return getAddrNonPIC(Op, DAG);
  }

-  EVT ValTy = Op.getValueType();
-  bool HasGotOfst = (GV->hasInternalLinkage() ||
-                     (GV->hasLocalLinkage() && !isa<Function>(GV)));
-  unsigned GotFlag = HasMips64 ?
-                     (HasGotOfst ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT_DISP) :
-                     (HasGotOfst ? MipsII::MO_GOT : MipsII::MO_GOT16);
-  SDValue GA = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0, GotFlag);
-  GA = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), GA);
-  SDValue ResNode = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), GA,
-                                MachinePointerInfo(), false, false, false, 0);
-  // On functions and global targets not internal linked only
-  // a load from got/GP is necessary for PIC to work.
-  if (!HasGotOfst)
-    return ResNode;
-  SDValue GALo = DAG.getTargetGlobalAddress(GV, dl, ValTy, 0,
-                                            HasMips64 ? MipsII::MO_GOT_OFST :
-                                                        MipsII::MO_ABS_LO);
-  SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, GALo);
-  return DAG.getNode(ISD::ADD, dl, ValTy, ResNode, Lo);
+  if (GV->hasInternalLinkage() || (GV->hasLocalLinkage() && !isa<Function>(GV)))
+    return getAddrLocal(Op, DAG, HasMips64);
+
+  if (LargeGOT)
+    return getAddrGlobalLargeGOT(Op, DAG, MipsII::MO_GOT_HI16,
+                                 MipsII::MO_GOT_LO16);
+
+  return getAddrGlobal(Op, DAG,
+                       HasMips64 ? MipsII::MO_GOT_DISP : MipsII::MO_GOT16);
 }

 SDValue MipsTargetLowering::LowerBlockAddress(SDValue Op,
                                              SelectionDAG &DAG) const {
-  const BlockAddress *BA = cast<BlockAddressSDNode>(Op)->getBlockAddress();
-  // FIXME there isn't actually debug info here
-  DebugLoc dl = Op.getDebugLoc();
+  if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)
+    return getAddrNonPIC(Op, DAG);

-  if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {
-    // %hi/%lo relocation
-    SDValue BAHi =
-      DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_HI);
-    SDValue BALo =
-      DAG.getTargetBlockAddress(BA, MVT::i32, 0, MipsII::MO_ABS_LO);
-    SDValue Hi = DAG.getNode(MipsISD::Hi, dl, MVT::i32, BAHi);
-    SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, BALo);
-    return DAG.getNode(ISD::ADD, dl, MVT::i32, Hi, Lo);
-  }
-
-  EVT ValTy = Op.getValueType();
-  unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
-  unsigned OFSTFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
-  SDValue BAGOTOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, GOTFlag);
-  BAGOTOffset = DAG.getNode(MipsISD::Wrapper, dl, ValTy,
-                            GetGlobalReg(DAG, ValTy), BAGOTOffset);
-  SDValue BALOOffset = DAG.getTargetBlockAddress(BA, ValTy, 0, OFSTFlag);
-  SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), BAGOTOffset,
-                             MachinePointerInfo(), false, false, false, 0);
-  SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, BALOOffset);
-  return DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo);
+  return getAddrLocal(Op, DAG, HasMips64);
 }

 SDValue MipsTargetLowering::
@ -1901,41 +1932,15 @@ LowerGlobalTLSAddress(SDValue Op, SelectionDAG &DAG) const
 SDValue MipsTargetLowering::
 LowerJumpTable(SDValue Op, SelectionDAG &DAG) const
 {
-  SDValue HiPart, JTI, JTILo;
-  // FIXME there isn't actually debug info here
-  DebugLoc dl = Op.getDebugLoc();
-  bool IsPIC = getTargetMachine().getRelocationModel() == Reloc::PIC_;
-  EVT PtrVT = Op.getValueType();
-  JumpTableSDNode *JT = cast<JumpTableSDNode>(Op);
+  if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)
+    return getAddrNonPIC(Op, DAG);

-  if (!IsPIC && !IsN64) {
-    JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_HI);
-    HiPart = DAG.getNode(MipsISD::Hi, dl, PtrVT, JTI);
-    JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, MipsII::MO_ABS_LO);
-  } else {// Emit Load from Global Pointer
-    unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
-    unsigned OfstFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
-    JTI = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, GOTFlag);
-    JTI = DAG.getNode(MipsISD::Wrapper, dl, PtrVT, GetGlobalReg(DAG, PtrVT),
-                      JTI);
-    HiPart = DAG.getLoad(PtrVT, dl, DAG.getEntryNode(), JTI,
-                         MachinePointerInfo(), false, false, false, 0);
-    JTILo = DAG.getTargetJumpTable(JT->getIndex(), PtrVT, OfstFlag);
-  }
-
-  SDValue Lo = DAG.getNode(MipsISD::Lo, dl, PtrVT, JTILo);
-  return DAG.getNode(ISD::ADD, dl, PtrVT, HiPart, Lo);
+  return getAddrLocal(Op, DAG, HasMips64);
 }

 SDValue MipsTargetLowering::
 LowerConstantPool(SDValue Op, SelectionDAG &DAG) const
 {
-  SDValue ResNode;
-  ConstantPoolSDNode *N = cast<ConstantPoolSDNode>(Op);
-  const Constant *C = N->getConstVal();
-  // FIXME there isn't actually debug info here
-  DebugLoc dl = Op.getDebugLoc();
-
  // gp_rel relocation
  // FIXME: we should reference the constant pool using small data sections,
  // but the asm printer currently doesn't support this feature without
@ -1946,31 +1951,10 @@ LowerConstantPool(SDValue Op, SelectionDAG &DAG) const
  //  SDValue GOT = DAG.getGLOBAL_OFFSET_TABLE(MVT::i32);
  //  ResNode = DAG.getNode(ISD::ADD, MVT::i32, GOT, GPRelNode);

-  if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64) {
-    SDValue CPHi = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(),
-                                             N->getOffset(), MipsII::MO_ABS_HI);
-    SDValue CPLo = DAG.getTargetConstantPool(C, MVT::i32, N->getAlignment(),
-                                             N->getOffset(), MipsII::MO_ABS_LO);
-    SDValue HiPart = DAG.getNode(MipsISD::Hi, dl, MVT::i32, CPHi);
-    SDValue Lo = DAG.getNode(MipsISD::Lo, dl, MVT::i32, CPLo);
-    ResNode = DAG.getNode(ISD::ADD, dl, MVT::i32, HiPart, Lo);
-  } else {
-    EVT ValTy = Op.getValueType();
-    unsigned GOTFlag = HasMips64 ? MipsII::MO_GOT_PAGE : MipsII::MO_GOT;
-    unsigned OFSTFlag = HasMips64 ? MipsII::MO_GOT_OFST : MipsII::MO_ABS_LO;
-    SDValue CP = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(),
-                                           N->getOffset(), GOTFlag);
-    CP = DAG.getNode(MipsISD::Wrapper, dl, ValTy, GetGlobalReg(DAG, ValTy), CP);
-    SDValue Load = DAG.getLoad(ValTy, dl, DAG.getEntryNode(), CP,
-                               MachinePointerInfo::getConstantPool(), false,
-                               false, false, 0);
-    SDValue CPLo = DAG.getTargetConstantPool(C, ValTy, N->getAlignment(),
-                                             N->getOffset(), OFSTFlag);
-    SDValue Lo = DAG.getNode(MipsISD::Lo, dl, ValTy, CPLo);
-    ResNode = DAG.getNode(ISD::ADD, dl, ValTy, Load, Lo);
-  }
+  if (getTargetMachine().getRelocationModel() != Reloc::PIC_ && !IsN64)
+    return getAddrNonPIC(Op, DAG);

-  return ResNode;
+  return getAddrLocal(Op, DAG, HasMips64);
 }

 SDValue MipsTargetLowering::LowerVASTART(SDValue Op, SelectionDAG &DAG) const {
@ -2862,60 +2846,41 @@ MipsTargetLowering::LowerCall(TargetLowering::CallLoweringInfo &CLI,
  // If the callee is a GlobalAddress/ExternalSymbol node (quite common, every
  // direct call is) turn it into a TargetGlobalAddress/TargetExternalSymbol
  // node so that legalize doesn't hack it.
-  unsigned char OpFlag;
  bool IsPICCall = (IsN64 || IsPIC); // true if calls are translated to jalr $25
  bool GlobalOrExternal = false;
  SDValue CalleeLo;

  if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee)) {
-    if (IsPICCall && G->getGlobal()->hasInternalLinkage()) {
-      OpFlag = IsO32 ? MipsII::MO_GOT : MipsII::MO_GOT_PAGE;
-      unsigned char LoFlag = IsO32 ? MipsII::MO_ABS_LO : MipsII::MO_GOT_OFST;
+    if (IsPICCall) {
+      if (G->getGlobal()->hasInternalLinkage())
+        Callee = getAddrLocal(Callee, DAG, HasMips64);
+      else if (LargeGOT)
+        Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16,
+                                       MipsII::MO_CALL_LO16);
+      else
+        Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL);
+    } else
      Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(), 0,
-                                          OpFlag);
-      CalleeLo = DAG.getTargetGlobalAddress(G->getGlobal(), dl, getPointerTy(),
-                                            0, LoFlag);
-    } else {
-      OpFlag = IsPICCall ? MipsII::MO_GOT_CALL : MipsII::MO_NO_FLAG;
-      Callee = DAG.getTargetGlobalAddress(G->getGlobal(), dl,
-                                          getPointerTy(), 0, OpFlag);
-    }
-
+                                          MipsII::MO_NO_FLAG);
    GlobalOrExternal = true;
  }
  else if (ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(Callee)) {
-    if (IsN64 || (!IsO32 && IsPIC))
-      OpFlag = MipsII::MO_GOT_DISP;
-    else if (!IsPIC) // !N64 && static
-      OpFlag = MipsII::MO_NO_FLAG;
+    if (!IsN64 && !IsPIC) // !N64 && static
+      Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(),
+                                            MipsII::MO_NO_FLAG);
+    else if (LargeGOT)
+      Callee = getAddrGlobalLargeGOT(Callee, DAG, MipsII::MO_CALL_HI16,
+                                     MipsII::MO_CALL_LO16);
+    else if (HasMips64)
+      Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_DISP);
    else // O32 & PIC
-      OpFlag = MipsII::MO_GOT_CALL;
-    Callee = DAG.getTargetExternalSymbol(S->getSymbol(), getPointerTy(),
-                                         OpFlag);
+      Callee = getAddrGlobal(Callee, DAG, MipsII::MO_GOT_CALL);
+
    GlobalOrExternal = true;
  }

  SDValue InFlag;

-  // Create nodes that load address of callee and copy it to T9
-  if (IsPICCall) {
-    if (GlobalOrExternal) {
-      // Load callee address
-      Callee = DAG.getNode(MipsISD::Wrapper, dl, getPointerTy(),
-                           GetGlobalReg(DAG, getPointerTy()), Callee);
-      SDValue LoadValue = DAG.getLoad(getPointerTy(), dl, DAG.getEntryNode(),
-                                      Callee, MachinePointerInfo::getGOT(),
-                                      false, false, false, 0);
-
-      // Use GOT+LO if callee has internal linkage.
-      if (CalleeLo.getNode()) {
-        SDValue Lo = DAG.getNode(MipsISD::Lo, dl, getPointerTy(), CalleeLo);
-        Callee = DAG.getNode(ISD::ADD, dl, getPointerTy(), LoadValue, Lo);
-      } else
-        Callee = LoadValue;
-    }
-  }
-
  // T9 register operand.
  SDValue T9;

--- a/lib/Target/Mips/MipsInstrInfo.td
+++ b/lib/Target/Mips/MipsInstrInfo.td
@ -1154,12 +1154,14 @@ def : MipsPat<(MipsHi tblockaddress:$in), (LUi tblockaddress:$in)>;
 def : MipsPat<(MipsHi tjumptable:$in), (LUi tjumptable:$in)>;
 def : MipsPat<(MipsHi tconstpool:$in), (LUi tconstpool:$in)>;
 def : MipsPat<(MipsHi tglobaltlsaddr:$in), (LUi tglobaltlsaddr:$in)>;
+def : MipsPat<(MipsHi texternalsym:$in), (LUi texternalsym:$in)>;

 def : MipsPat<(MipsLo tglobaladdr:$in), (ADDiu ZERO, tglobaladdr:$in)>;
 def : MipsPat<(MipsLo tblockaddress:$in), (ADDiu ZERO, tblockaddress:$in)>;
 def : MipsPat<(MipsLo tjumptable:$in), (ADDiu ZERO, tjumptable:$in)>;
 def : MipsPat<(MipsLo tconstpool:$in), (ADDiu ZERO, tconstpool:$in)>;
 def : MipsPat<(MipsLo tglobaltlsaddr:$in), (ADDiu ZERO, tglobaltlsaddr:$in)>;
+def : MipsPat<(MipsLo texternalsym:$in), (ADDiu ZERO, texternalsym:$in)>;

 def : MipsPat<(add CPURegs:$hi, (MipsLo tglobaladdr:$lo)),
              (ADDiu CPURegs:$hi, tglobaladdr:$lo)>;
--- a/lib/Target/Mips/MipsJITInfo.cpp
+++ b/lib/Target/Mips/MipsJITInfo.cpp
@ -222,10 +222,17 @@ void *MipsJITInfo::emitFunctionStub(const Function *F, void *Fn,
  // addiu t9, t9, %lo(EmittedAddr)
  // jalr t8, t9
  // nop
-  JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi);
-  JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo);
-  JCE.emitWordLE(25 << 21 | 24 << 11 | 9);
-  JCE.emitWordLE(0);
+  if (IsLittleEndian) {
+    JCE.emitWordLE(0xf << 26 | 25 << 16 | Hi);
+    JCE.emitWordLE(9 << 26 | 25 << 21 | 25 << 16 | Lo);
+    JCE.emitWordLE(25 << 21 | 24 << 11 | 9);
+    JCE.emitWordLE(0);
+  } else {
+    JCE.emitWordBE(0xf << 26 | 25 << 16 | Hi);
+    JCE.emitWordBE(9 << 26 | 25 << 21 | 25 << 16 | Lo);
+    JCE.emitWordBE(25 << 21 | 24 << 11 | 9);
+    JCE.emitWordBE(0);
+  }

  sys::Memory::InvalidateInstructionCache(Addr, 16);
  if (!sys::Memory::setRangeExecutable(Addr, 16))
--- a/lib/Target/Mips/MipsJITInfo.h
+++ b/lib/Target/Mips/MipsJITInfo.h
@ -26,10 +26,11 @@ class MipsTargetMachine;
 class MipsJITInfo : public TargetJITInfo {

  bool IsPIC;
+  bool IsLittleEndian;

  public:
    explicit MipsJITInfo() :
-      IsPIC(false) {}
+      IsPIC(false), IsLittleEndian(true) {}

    /// replaceMachineCodeForFunction - Make it so that calling the function
    /// whose machine code is at OLD turns into a call to NEW, perhaps by
@ -58,8 +59,10 @@ class MipsJITInfo : public TargetJITInfo {
                          unsigned NumRelocs, unsigned char *GOTBase);

    /// Initialize - Initialize internal stage for the function being JITted.
-    void Initialize(const MachineFunction &MF, bool isPIC) {
+    void Initialize(const MachineFunction &MF, bool isPIC,
+                    bool isLittleEndian) {
      IsPIC = isPIC;
+      IsLittleEndian = isLittleEndian;
    }

 };
--- a/lib/Target/Mips/MipsMCInstLower.cpp
+++ b/lib/Target/Mips/MipsMCInstLower.cpp
@ -62,6 +62,10 @@ MCOperand MipsMCInstLower::LowerSymbolOperand(const MachineOperand &MO,
  case MipsII::MO_GOT_OFST:  Kind = MCSymbolRefExpr::VK_Mips_GOT_OFST; break;
  case MipsII::MO_HIGHER:    Kind = MCSymbolRefExpr::VK_Mips_HIGHER; break;
  case MipsII::MO_HIGHEST:   Kind = MCSymbolRefExpr::VK_Mips_HIGHEST; break;
+  case MipsII::MO_GOT_HI16:  Kind = MCSymbolRefExpr::VK_Mips_GOT_HI16; break;
+  case MipsII::MO_GOT_LO16:  Kind = MCSymbolRefExpr::VK_Mips_GOT_LO16; break;
+  case MipsII::MO_CALL_HI16: Kind = MCSymbolRefExpr::VK_Mips_CALL_HI16; break;
+  case MipsII::MO_CALL_LO16: Kind = MCSymbolRefExpr::VK_Mips_CALL_LO16; break;
  }

  switch (MOTy) {
--- a/lib/Transforms/Scalar/SROA.cpp
+++ b/lib/Transforms/Scalar/SROA.cpp
@ -2160,6 +2160,9 @@ static bool isIntegerWideningViable(const DataLayout &TD,
                                    AllocaPartitioning::const_use_iterator I,
                                    AllocaPartitioning::const_use_iterator E) {
  uint64_t SizeInBits = TD.getTypeSizeInBits(AllocaTy);
+  // Don't create integer types larger than the maximum bitwidth.
+  if (SizeInBits > IntegerType::MAX_INT_BITS)
+    return false;

  // Don't try to handle allocas with bit-padding.
  if (SizeInBits != TD.getTypeStoreSizeInBits(AllocaTy))
@ -2198,7 +2201,7 @@ static bool isIntegerWideningViable(const DataLayout &TD,
      if (RelBegin == 0 && RelEnd == Size)
        WholeAllocaOp = true;
      if (IntegerType *ITy = dyn_cast<IntegerType>(LI->getType())) {
-        if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy))
+        if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy))
          return false;
        continue;
      }
@ -2214,7 +2217,7 @@ static bool isIntegerWideningViable(const DataLayout &TD,
      if (RelBegin == 0 && RelEnd == Size)
        WholeAllocaOp = true;
      if (IntegerType *ITy = dyn_cast<IntegerType>(ValueTy)) {
-        if (ITy->getBitWidth() < TD.getTypeStoreSize(ITy))
+        if (ITy->getBitWidth() < TD.getTypeStoreSizeInBits(ITy))
          return false;
        continue;
      }
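
Note on the SROA hunks above: the old check compared a width in bits with a store size in bytes, so sub-byte integers such as `i1` slipped through. The sketch below models the two queries with plain arithmetic to show the difference. The helper names are hypothetical; only the round-up-to-a-byte rule and the `(1<<23)-1` limit quoted in the accompanying test comment are taken as given.

```cpp
#include <cassert>
#include <cstdint>

// Simplified models of the DataLayout queries; the real ones take a Type*.
// Store size in bytes: the bit width rounded up to a whole byte.
uint64_t typeStoreSize(uint64_t BitWidth) { return (BitWidth + 7) / 8; }

// Store size in bits: the byte count scaled back to bits.
uint64_t typeStoreSizeInBits(uint64_t BitWidth) {
  return 8 * typeStoreSize(BitWidth);
}

// Old check: compares bits against bytes, so it almost never fires.
bool rejectsOld(uint64_t BitWidth) {
  return BitWidth < typeStoreSize(BitWidth);
}

// New check: compares bits against bits, rejecting integers with padding.
bool rejectsNew(uint64_t BitWidth) {
  return BitWidth < typeStoreSizeInBits(BitWidth);
}

// Maximum integer width as stated in the PR14465 test comment.
constexpr uint64_t MaxIntBits = (1u << 23) - 1;

// Models the new early exit for allocas too large for any integer type.
bool fitsInIntegerType(uint64_t SizeInBits) {
  return SizeInBits <= MaxIntBits;
}
```

With these definitions, `rejectsOld(1)` is false while `rejectsNew(1)` is true, and `fitsInIntegerType(1048576ull * 32)` is false, which corresponds to the `[1048576 x i32]` alloca in the PR14465 test.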
--- a/test/CodeGen/Mips/biggot.ll
+++ b/test/CodeGen/Mips/biggot.ll
@ -0,0 +1,50 @@
+; RUN: llc -march=mipsel -mxgot < %s | FileCheck %s -check-prefix=O32
+; RUN: llc -march=mips64el -mcpu=mips64r2 -mattr=+n64 -mxgot < %s | \
+; RUN: FileCheck %s -check-prefix=N64
+
+@v0 = external global i32
+
+define void @foo1() nounwind {
+entry:
+; O32: lui $[[R0:[0-9]+]], %got_hi(v0)
+; O32: addu  $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}}
+; O32: lw  ${{[0-9]+}}, %got_lo(v0)($[[R1]])
+; O32: lui $[[R2:[0-9]+]], %call_hi(foo0)
+; O32: addu  $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
+; O32: lw  ${{[0-9]+}}, %call_lo(foo0)($[[R3]])
+
+; N64: lui $[[R0:[0-9]+]], %got_hi(v0)
+; N64: daddu  $[[R1:[0-9]+]], $[[R0]], ${{[a-z0-9]+}}
+; N64: ld  ${{[0-9]+}}, %got_lo(v0)($[[R1]])
+; N64: lui $[[R2:[0-9]+]], %call_hi(foo0)
+; N64: daddu  $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
+; N64: ld  ${{[0-9]+}}, %call_lo(foo0)($[[R3]])
+
+  %0 = load i32* @v0, align 4
+  tail call void @foo0(i32 %0) nounwind
+  ret void
+}
+
+declare void @foo0(i32)
+
+; call to external function.
+
+define void @foo2(i32* nocapture %d, i32* nocapture %s, i32 %n) nounwind {
+entry:
+; O32: foo2:
+; O32: lui $[[R2:[0-9]+]], %call_hi(memcpy)
+; O32: addu  $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
+; O32: lw  ${{[0-9]+}}, %call_lo(memcpy)($[[R3]])
+
+; N64: foo2:
+; N64: lui $[[R2:[0-9]+]], %call_hi(memcpy)
+; N64: daddu  $[[R3:[0-9]+]], $[[R2]], ${{[a-z0-9]+}}
+; N64: ld  ${{[0-9]+}}, %call_lo(memcpy)($[[R3]])
+
+  %0 = bitcast i32* %d to i8*
+  %1 = bitcast i32* %s to i8*
+  tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %0, i8* %1, i32 %n, i32 4, i1 false)
+  ret void
+}
+
+declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i32, i1) nounwind
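
Note on the test above: with `-mxgot` each GOT access becomes `lui` with the high half, an add of the global base register, and a load using the low half as displacement. The sketch below shows why the three steps reach the intended GOT slot. It assumes the usual MIPS convention that the low half is sign-extended and the high half is adjusted to compensate; the function names are hypothetical and the real values are filled in by the linker, not by LLVM.

```cpp
#include <cassert>
#include <cstdint>

// High 16 bits, rounded so that adding the sign-extended low half is exact.
uint32_t gotHi16(int32_t Offset) {
  return ((static_cast<uint32_t>(Offset) + 0x8000u) >> 16) & 0xffffu;
}

// Low 16 bits, interpreted as a signed displacement.
int32_t gotLo16(int32_t Offset) {
  int32_t Lo = static_cast<int32_t>(static_cast<uint32_t>(Offset) & 0xffffu);
  if (Lo >= 0x8000)
    Lo -= 0x10000;
  return Lo;
}

// Models: lui R, hi ; addu R, R, gp ; lw dst, lo(R)
// and returns the address the final load would read from.
uint32_t gotEntryAddress(uint32_t GP, int32_t Offset) {
  uint32_t R = gotHi16(Offset) << 16;                // lui
  R += GP;                                           // addu with the base register
  return R + static_cast<uint32_t>(gotLo16(Offset)); // effective address of the load
}
```

For example, an offset of `0x18000` splits into a high half of `2` and a low half of `-0x8000`, and `gotEntryAddress(0x10000000, 0x18000)` gives `0x10018000`.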
--- a/test/MC/Mips/xgot.ll
+++ b/test/MC/Mips/xgot.ll
@ -0,0 +1,42 @@
+; RUN: llc -filetype=obj -mtriple mipsel-unknown-linux -mxgot %s -o - | elf-dump --dump-section-data  | FileCheck %s
+
+@.str = private unnamed_addr constant [16 x i8] c"ext_1=%d, i=%d\0A\00", align 1
+@ext_1 = external global i32
+
+define void @fill() nounwind {
+entry:
+
+; Check that the appropriate relocations were created. 
+; For the xgot case we want to see R_MIPS_[GOT|CALL]_[HI|LO]16.
+
+; R_MIPS_HI16
+; CHECK:     ('r_type', 0x05)
+
+; R_MIPS_LO16
+; CHECK:     ('r_type', 0x06)
+
+; R_MIPS_GOT_HI16
+; CHECK:     ('r_type', 0x16)
+
+; R_MIPS_GOT_LO16
+; CHECK:     ('r_type', 0x17)
+
+; R_MIPS_GOT16
+; CHECK:     ('r_type', 0x09)
+
+; R_MIPS_LO16
+; CHECK:     ('r_type', 0x06)
+
+; R_MIPS_CALL_HI16
+; CHECK:     ('r_type', 0x1e)
+
+; R_MIPS_CALL_LO16
+; CHECK:     ('r_type', 0x1f)
+
+  %0 = load i32* @ext_1, align 4
+  %call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([16 x i8]* @.str, i32 0, i32 0), i32 %0) nounwind
+  ret void
+}
+
+declare i32 @printf(i8* nocapture, ...) nounwind
+
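
Note on the test above: the `r_type` values it checks are the numeric ELF relocation types for MIPS. The sketch below is a small lookup table for reading such dumps. The enum and `relocName` are hypothetical helpers, not LLVM declarations; the numeric values are the standard MIPS ELF relocation numbers, and only the types that appear in this test are listed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Numeric ELF relocation types for MIPS that the test above checks.
enum MipsRelocType : uint8_t {
  R_MIPS_HI16 = 0x05,
  R_MIPS_LO16 = 0x06,
  R_MIPS_GOT16 = 0x09,
  R_MIPS_GOT_HI16 = 0x16,
  R_MIPS_GOT_LO16 = 0x17,
  R_MIPS_CALL_HI16 = 0x1e,
  R_MIPS_CALL_LO16 = 0x1f
};

// Hypothetical helper: map an r_type value from a dump to a readable name.
std::string relocName(uint8_t Type) {
  switch (Type) {
  case R_MIPS_HI16:      return "R_MIPS_HI16";
  case R_MIPS_LO16:      return "R_MIPS_LO16";
  case R_MIPS_GOT16:     return "R_MIPS_GOT16";
  case R_MIPS_GOT_HI16:  return "R_MIPS_GOT_HI16";
  case R_MIPS_GOT_LO16:  return "R_MIPS_GOT_LO16";
  case R_MIPS_CALL_HI16: return "R_MIPS_CALL_HI16";
  case R_MIPS_CALL_LO16: return "R_MIPS_CALL_LO16";
  default:               return "unknown";
  }
}
```

The four types added by this change are the `0x16`, `0x17`, `0x1e` and `0x1f` entries; the others were already emitted for local symbols such as the `.str` constant.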
--- a/test/Transforms/SROA/basictest.ll
+++ b/test/Transforms/SROA/basictest.ll
@ -1134,3 +1134,45 @@ entry:
  ret void
 ; CHECK: ret
 }
+
+define void @PR14465() {
+; Ensure that we don't crash when analyzing an alloca larger than the maximum
+; integer type width (MAX_INT_BITS) supported by llvm (1048576*32 > (1<<23)-1).
+; CHECK: @PR14465
+
+  %stack = alloca [1048576 x i32], align 16
+; CHECK: alloca [1048576 x i32]
+  %cast = bitcast [1048576 x i32]* %stack to i8*
+  call void @llvm.memset.p0i8.i64(i8* %cast, i8 -2, i64 4194304, i32 16, i1 false)
+  ret void
+; CHECK: ret
+}
+
+define void @PR14548(i1 %x) {
+; Handle a mixture of i1 and i8 loads and stores to allocas. This particular
+; pattern caused crashes and invalid output in the PR, and its nature will
+; trigger a mixture in several permutations as we resolve each alloca
+; iteratively.
+; Note that we don't do a particularly good *job* of handling these mixtures,
+; but the hope is that this is very rare.
+; CHECK: @PR14548
+
+entry:
+  %a = alloca <{ i1 }>, align 8
+  %b = alloca <{ i1 }>, align 8
+; Nothing of interest is simplified here.
+; CHECK: alloca
+; CHECK: alloca
+
+  %b.i1 = bitcast <{ i1 }>* %b to i1*
+  store i1 %x, i1* %b.i1, align 8
+  %b.i8 = bitcast <{ i1 }>* %b to i8*
+  %foo = load i8* %b.i8, align 1
+
+  %a.i8 = bitcast <{ i1 }>* %a to i8*
+  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %a.i8, i8* %b.i8, i32 1, i32 1, i1 false) nounwind
+  %bar = load i8* %a.i8, align 1
+  %a.i1 = getelementptr inbounds <{ i1 }>* %a, i32 0, i32 0
+  %baz = load i1* %a.i1, align 1
+  ret void
+}
--- a/test/Transforms/SROA/big-endian.ll
+++ b/test/Transforms/SROA/big-endian.ll
@ -82,14 +82,9 @@ entry:

  %a0i16ptr = bitcast i8* %a0ptr to i16*
  store i16 1, i16* %a0i16ptr
-; CHECK:      %[[mask0:.*]] = and i16 1, -16
-
-  %a1i4ptr = bitcast i8* %a1ptr to i4*
-  store i4 1, i4* %a1i4ptr
-; CHECK-NEXT: %[[insert0:.*]] = or i16 %[[mask0]], 1

  store i8 1, i8* %a2ptr
-; CHECK-NEXT: %[[mask1:.*]] = and i40 undef, 4294967295
+; CHECK:      %[[mask1:.*]] = and i40 undef, 4294967295
 ; CHECK-NEXT: %[[insert1:.*]] = or i40 %[[mask1]], 4294967296

  %a3i24ptr = bitcast i8* %a3ptr to i24*
@ -110,7 +105,7 @@ entry:
  %ai = load i56* %aiptr
  %ret = zext i56 %ai to i64
  ret i64 %ret
-; CHECK-NEXT: %[[ext4:.*]] = zext i16 %[[insert0]] to i56
+; CHECK-NEXT: %[[ext4:.*]] = zext i16 1 to i56
 ; CHECK-NEXT: %[[shift4:.*]] = shl i56 %[[ext4]], 40
 ; CHECK-NEXT: %[[mask4:.*]] = and i56 %[[insert3]], 1099511627775
 ; CHECK-NEXT: %[[insert4:.*]] = or i56 %[[mask4]], %[[shift4]]