Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on
PowerPC
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the
GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in
PR26761, this is currently broken on PowerPC (and likely on ARM as
well). Currently, @llvm.eh.dwarf.cfa is lowered using:
ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET)
where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86,
FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however,
does not work for PowerPC. Because of the way that the stack layout
works, the canonical frame address is not exactly (FRAMEADDR +
FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset
as well), so it is not just a matter of implementing
FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics --
We can do that, since it is currently used only for
@llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA
construct itself (since it can be easily represented as a
fixed-offset FrameIndex)). Mips currently does this, but by using a
custom lowering for ADD that specifically recognizes the (FRAMEADDR,
FRAME_TO_ARGS_OFFSET) pattern.
This change introduces a ISD::EH_DWARF_CFA node, which by default
expands using the existing logic, but can be directly lowered by the
target. Mips is updated to use this method (which simplifies its
implementation, and I suspect makes it more robust), and updates
PowerPC to do the same.
Fixes PR26761.
Differential Revision: https://reviews.llvm.org/D24038
[DwarfDebug] Move MergeValues to .cpp, NFC
Pull in r257979 from upstream llvm trunk, by Keno Fischer:
[DwarfDebug] Don't merge DebugLocEntries if their pieces overlap
Summary:
Later in DWARF emission we check that DebugLocEntries have
non-overlapping pieces, so we should create any such entries
by merging here.
Fixes PR26163.
Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16249
Again, these will be merged to the official release_38 branch soon, but
we need them ASAP.
be merged to the official release_38 branch soon, but we need it ASAP):
Stop increasing alignment of externally-visible globals on ELF
platforms.
With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.
This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311
(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)
Differential Revision: http://reviews.llvm.org/D16145
from upstream clang trunk, which sets the default debug tuning back to
gdb. The lldb debug tuning is not yet grokked completely by our ELF
manipulation tools.
bugfix-only release, with no new features.
Please note that from 3.5.0 onwards, clang and llvm require C++11
support to build; see UPDATING for more information.
ARM: treat [N x i32] and [N x i64] as AAPCS composite types
The logic is almost there already, with our special homogeneous
aggregate handling. Tweaking it like this allows front-ends to emit
AAPCS compliant code without ever having to count registers or add
discarded padding arguments.
Only arrays of i32 and i64 are needed to model AAPCS rules, but I
decided to apply the logic to all integer arrays for more consistency.
This fixes a possible "Unexpected member type for HA" error when
compiling lib/msun/bsdsrc/b_tgamma.c for armv6.
Reported by: Jakub Palider <jpa@semihalf.com>
[X86] Convert esp-relative movs of function arguments to pushes, step 2
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a
reserved call frame), and perform rudimentary call folding. It still doesn't
have a heuristic, so it is enabled only for optsize/minsize, with stack
alignment <= 8, where it ought to be a fairly clear win.
(Re-commit of r227728)
Differential Revision: http://reviews.llvm.org/D6789
This helps to get sys/boot/i386/boot2 below the required size again,
when optimizing with -Oz.
which gcc in base cannot handle. Replace these with C++98 equivalents.
While here, add the patch for the adapted fix.
Reported by: bz, kib
Pointy hat to: dim
MFC after: 1 week
X-MFC-With: r274442
Totally forget deallocated SDNodes in SDDbgInfo.
What would happen before that commit is that the SDDbgValues associated with
a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep
a map entry keyed by the SDNode pointer pointing to this list of invalidated
SDDbgNodes. As the memory gets reused, the list might get wrongly associated
with another new SDNode. As the SDDbgValues are cloned when they are transfered,
this can lead to an exponential number of SDDbgValues being produced during
DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893
Note that the previous behavior wasn't really buggy as the invalidation made
sure that the SDDbgValues won't be used. This commit can be considered a
memory optimization and as such is really hard to validate in a unit-test.
This should fix abnormally large memory usage and resulting OOM crashes
when compiling certain ports with debug information.
Reported by: Dmitry Marakasov <amdmi3@amdmi3.ru>
Upstream PRs: http://llvm.org/PR19031http://llvm.org/PR20893
MFC after: 1 week
Legalizer: Add support for splitting insert_subvectors.
We handle this by spilling the whole thing to the stack and doing the
insertion as a store.
PR19492. This happens in real code because the vectorizer creates
v2i128 when AVX is enabled.
This fixes a "fatal error: error in backend: Do not know how to split
the result of this operator!" message encountered during compilation of
the net-p2p/libtorrent-rasterbar port.
Reported by: Evgeniy <iron@mail.ua>
MFC after: 3 days
Debug info: Support variadic functions.
Variadic functions have an unspecified parameter tag after the last
argument. In IR this is represented as an unspecified parameter in the
subroutine type.
Paired commit with CFE r202185.
rdar://problem/13690847
This re-applies r202184 + a bugfix in DwarfDebug's argument handling.
This merge includes a change to use the LLVM 3.4 API in
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:
DwarfUnit -> CompileUnit
Sponsored by: DARPA, AFRL
ISel: Make VSELECT selection terminate in cases where the condition type has to
be split and the result type widened.
When the condition of a vselect has to be split it makes no sense widening the
vselect and thereby widening the condition. We end up in an endless loop of
widening (vselect result type) and splitting (condition mask type) doing this.
Instead, split both the condition and the vselect and widen the result.
I ran this over the test suite with i686 and mattr=+sse and saw no regressions.
Fixes PR18036.
With this fix the original problem case from the graphics/rawtherapee
port (posted in http://llvm.org/PR18036 ) now compiles within ~97MB RSS.
Reported by: mandree
MFC after: 1 week
Reland "Fix miscompile of MS inline assembly with stack realignment"
This re-lands commit r196876, which was reverted in r196879.
The tests have been fixed to pass on platforms with a stack alignment
larger than 4.
Update to clang side tests will land shortly.
Pull in r196986 from upstream llvm trunk (by Reid Kleckner):
Revert the backend fatal error from r196939
The combination of inline asm, stack realignment, and dynamic allocas
turns out to be too common to reject out of hand.
ASan inserts empy inline asm fragments and uses aligned allocas.
Compiling any trivial function containing a dynamic alloca with ASan is
enough to trigger the check.
XFAIL the test cases that would be miscompiled and add one that uses the
relevant functionality.
Pull in r202930 from upstream llvm trunk (by Hans Wennborg):
Check for dynamic allocas and inline asm that clobbers sp before building
selection dag (PR19012)
In X86SelectionDagInfo::EmitTargetCodeForMemcpy we check with MachineFrameInfo
to make sure that ESI isn't used as a base pointer register before we choose to
emit rep movs (which clobbers esi).
The problem is that MachineFrameInfo wouldn't know about dynamic allocas or
inline asm that clobbers the stack pointer until SelectionDAGBuilder has
encountered them.
This patch fixes the problem by checking for such things when building the
FunctionLoweringInfo.
Differential Revision: http://llvm-reviews.chandlerc.com/D2954
Together, these commits fix the problem encountered in the devel/emacs
port on the i386 architecture, where a combination of stack realignment,
alloca() and memcpy() could incidentally clobber the %esi register,
leading to segfaults in the temacs build-time utility.
See also: http://llvm.org/PR18171 and http://llvm.org/PR19012
Reported by: ashish
PR: ports/183064
MFC after: 1 week
all of the features in the current working draft of the upcoming C++
standard, provisionally named C++1y.
The code generator's performance is greatly increased, and the loop
auto-vectorizer is now enabled at -Os and -O2 in addition to -O3. The
PowerPC backend has made several major improvements to code generation
quality and compile time, and the X86, SPARC, ARM32, Aarch64 and SystemZ
backends have all seen major feature work.
Release notes for llvm and clang can be found here:
<http://llvm.org/releases/3.4/docs/ReleaseNotes.html>
<http://llvm.org/releases/3.4/tools/clang/docs/ReleaseNotes.html>
MFC after: 1 month
Remove invalid assert in DAGTypeLegalizer::RemapValue
There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks
which, in part, says:
// Note that these invariants may not hold momentarily when processing a node:
// the node being processed may be put in a map before being marked Processed.
Unfortunately, this assert would be valid only if the above-mentioned invariant
held unconditionally. This was causing llc to assert when, in fact,
everything was fine.
Thanks to Richard Sandiford for investigating this issue!
Fixes PR16562.
This fixes assertions which could occur in the multimedia/ffmpeg1 and
multimedia/ffmpeg2 ports.
Approved by: re (hrs)
Reported by: Matthias Apitz <guru@unixarea.de>
MFC after: 3 days
ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.
Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.
This should fix PR15840.
Specifically, this fixes the long-standing assertion failure when
compiling the multimedia/gstreamer port on i386. Thanks to Tijl
Coosemans for his help in getting upstream to fix it.
Approved by: re (marius)
FastISel can only append to basic blocks.
Compute the insertion point from the end of the basic block instead of
skipping labels from the front.
This caused failures in landing pads when live-in copies where inserted
before instruction selection.
I missed this change in r252720; without it, certain compilation flags
can cause exception labels to not be generated, but still referenced,
leading to link errors.
Reported by: zeising
MFC after: 3 days
Add MachineBasicBlock::addLiveIn().
This function adds a live-in physical register to an MBB and ensures
that it is copied to a virtual register immediately.
Pull in r185615 from llvm trunk:
Live-in copies go *after* EH_LABELs.
This will soon be tested by exception handling working at all.
Pull in r185617 from llvm trunk:
Simplify landing pad lowering.
Stop using the ISD::EXCEPTIONADDR and ISD::EHSELECTION when lowering
landing pad arguments. These nodes were previously legalized into
CopyFromReg nodes, but that never worked properly because the
CopyFromReg node weren't guaranteed to be scheduled at the top of the
basic block.
This meant the exception pointer and selector registers could be
clobbered before being copied to a virtual register.
This patch copies the two physical registers to virtual registers at
the beginning of the basic block, and lowers the landingpad instruction
directly to two CopyFromReg nodes reading the *virtual* registers. This
is safe because virtual registers don't get clobbered.
A future patch will remove the ISD::EXCEPTIONADDR and ISD::EHSELECTION
nodes.
Together, these changes fix llvm PR 16038 ('qt4 webcore file results in
"Bad machine code: Using an undefined physical register"'), and should
make it possible again to compile the www/qt4-webkit port again on the
i386 arch, without using a CPUTYPE=i686 or higher setting.
Make PrologEpilogInserter save/restore all callee saved registers in
functions which call __builtin_unwind_init()
__builtin_unwind_init() is an undocumented gcc intrinsic which has
this effect, and is used in libgcc_eh.
Goes part of the way toward fixing PR8541.
This obsoletes the ugly hack to libgcc's unwind code from r245272, and
should also work for other arches, so revert the hack too.
PR15662: Optimized debug info produces out of order function
parameters
When a function is inlined we lazily construct the variables
representing the function's parameters. After that, we add any
remaining unused parameters.
If the function doesn't use all the parameters, or uses them out of
order, then the DWARF would produce them in that order, producing a
parameter order that doesn't match the source.
This fix causes us to always keep the arg variables at the start of
the variable list & in the original order from the source.
Reported by: avg
MFC after: 1 week
upcoming 3.3 release (branching and freezing expected in a few weeks).
Preliminary release notes can be found at the usual location:
<http://llvm.org/docs/ReleaseNotes.html>
An MFC is planned once the actual 3.3 release is finished.
When creating MCAsmBackend pass the CPU string as well. In X86AsmBackend
store this and use it to not emit long nops when the CPU is geode which
doesnt support them.
Fixes PR11212.
Pull in r164133 from upstream clang trunk:
Follow up on llvm r164132.
This should prevent illegal instructions when building world on Geode
CPUs (e.g. Soekris).
MFC after: 3 days