X86: Don't fold spills into SSE operations if the stack is unaligned.
Regalloc can emit unaligned spills nowadays, but we can't fold the
spills into SSE ops if we can't guarantee alignment. PR12250.
This fixes unaligned SSE accesses (leading to a SIGBUS) which could
occur in the ffmpeg ports.
Approved by: re (kib)
Reported by: tijl
MFC after: 3 days
Add ms_abi and sysv_abi attribute handling.
Based on a patch by Benno Rice!
This will help to develop EFI support.
Approved by: re (kib)
Verified by: benno
MFC after: 1 week
Remove invalid assert in DAGTypeLegalizer::RemapValue
There is a comment at the top of DAGTypeLegalizer::PerformExpensiveChecks
which, in part, says:
// Note that these invariants may not hold momentarily when processing a node:
// the node being processed may be put in a map before being marked Processed.
Unfortunately, this assert would be valid only if the above-mentioned invariant
held unconditionally. This was causing llc to assert when, in fact,
everything was fine.
Thanks to Richard Sandiford for investigating this issue!
Fixes PR16562.
This fixes assertions which could occur in the multimedia/ffmpeg1 and
multimedia/ffmpeg2 ports.
Approved by: re (hrs)
Reported by: Matthias Apitz <guru@unixarea.de>
MFC after: 3 days
The X86FixupLEAs pass for Intel Atom must not call
convertToThreeAddress on ADD16rr opcodes, if src1 != src, since that
would cause convertToThreeAddress to try to create a virtual register.
This is not permitted after register allocation, which is when the
X86FixupLEAs pass runs.
This patch fixes PR16785.
Pull in r191715 from upstream llvm trunk:
Forgot to add a break statement.
This should enable building the x11-toolskits/libXaw port with
CPUTYPE=atom.
Approved by: re (gjb)
Reported by: Kenta Suzumoto <kentas@hush.com>
MFC after: 3 days
ISelDAG: spot chain cycles involving MachineNodes
Previously, the DAGISel function WalkChainUsers was spotting that it
had entered already-selected territory by whether a node was a
MachineNode (amongst other things). Since it's fairly common practice
to insert MachineNodes during ISelLowering, this was not the correct
check.
Looking around, it seems that other nodes get their NodeId set to -1
upon selection, so this makes sure the same thing happens to all
MachineNodes and uses that characteristic to determine whether we
should stop looking for a loop during selection.
This should fix PR15840.
Specifically, this fixes the long-standing assertion failure when
compiling the multimedia/gstreamer port on i386. Thanks to Tijl
Coosemans for his help in getting upstream to fix it.
Approved by: re (marius)
To enable them, set WITH_GCC and WITH_GNUCXX in src.conf.
Make clang default to using libc++ on FreeBSD 10.
Bumped __FreeBSD_version for the change.
GCC is still enabled on PC98, because the PC98 bootloader requires GCC to build
(or, at least, hard-codes the use of gcc into its build).
Thanks to everyone who helped make the ports tree ready for this (and bapt
for coordinating them all). Also to imp for reviewing this and working on the
forward-porting of the changes in our gcc so that we're getting to a much
better place with regard to external toolchains.
Sorry to all of the people who helped who I forgot to mention by name.
Reviewed by: bapt, imp, dim, ...
InstCombine: Check for zero shift amounts before subtracting one
causing integer overflow.
PR17026. Also avoid undefined shifts and shift amounts larger than 64
bits (those are always undef because we can't represent integer types
that large).
This should fix assertion failures when building the emulators/xmame
port.
Reported by: bapt
NoBuiltin was introduced after clang/llvm 3.3 and thus does not exist in
FreeBSD. Thus special handling for the attribute is not needed in lldb.
This reverts lldb r186990 (git eebd175)
Sponsored by: DARPA, AFRL
Author: Daniel Malea <daniel.malea@intel.com>
Date: Thu Aug 1 21:18:16 2013 +0000
Fixed the Intel-syntax X86 disassembler to respect the (existing)
option for hexadecimal immediates, to match AT&T syntax. This also
brings a new option for C-vs-MASM-style hex.
Patch by Richard Mitton
Reviewed: http://llvm-reviews.chandlerc.com/D1243
Fix handling of braced-init-list as reference initializer within
aggregate initialization. Previously we would incorrectly require an
extra set of braces around such initializers.
Pull in r188718 from upstream clang trunk:
Handle init lists and _Atomic fields.
Fixes PR16931.
These fixes are needed for the atomic_flag type to work correctly in our
stdatomic.h.
Requested by: theraven
PR16727: don't try to evaluate a potentially value-dependent
expression when checking for missing parens in &&/|| expressions.
This fixes an assertion encountered when building the lang/sdcc port.
Reported by: kwm
This patch implements __get_cpuid_max() as an inline and __cpuid()
and __cpuid_count() as macros to be compatible with GCC's cpuid.h.
It also adds bit_<foo> constants for the various feature bits as
described in version 039 (May 2011) of Intel's SDM Volume 2 in the
description of the CPUID instruction. The list of bit_<foo>
constants is a bit exhaustive (GCC doesn't do near this many). More
bits could be added from a newer version of SDM if desired.
Patch by John Baldwin!
This should fix several ports which depend on this functionality being
available.
MFC after: 1 week
FastISel can only append to basic blocks.
Compute the insertion point from the end of the basic block instead of
skipping labels from the front.
This caused failures in landing pads when live-in copies where inserted
before instruction selection.
I missed this change in r252720; without it, certain compilation flags
can cause exception labels to not be generated, but still referenced,
leading to link errors.
Reported by: zeising
MFC after: 3 days
Add MachineBasicBlock::addLiveIn().
This function adds a live-in physical register to an MBB and ensures
that it is copied to a virtual register immediately.
Pull in r185615 from llvm trunk:
Live-in copies go *after* EH_LABELs.
This will soon be tested by exception handling working at all.
Pull in r185617 from llvm trunk:
Simplify landing pad lowering.
Stop using the ISD::EXCEPTIONADDR and ISD::EHSELECTION when lowering
landing pad arguments. These nodes were previously legalized into
CopyFromReg nodes, but that never worked properly because the
CopyFromReg node weren't guaranteed to be scheduled at the top of the
basic block.
This meant the exception pointer and selector registers could be
clobbered before being copied to a virtual register.
This patch copies the two physical registers to virtual registers at
the beginning of the basic block, and lowers the landingpad instruction
directly to two CopyFromReg nodes reading the *virtual* registers. This
is safe because virtual registers don't get clobbered.
A future patch will remove the ISD::EXCEPTIONADDR and ISD::EHSELECTION
nodes.
Together, these changes fix llvm PR 16038 ('qt4 webcore file results in
"Bad machine code: Using an undefined physical register"'), and should
make it possible again to compile the www/qt4-webkit port again on the
i386 arch, without using a CPUTYPE=i686 or higher setting.
the stack in a leaf function that uses TLS.
The issue is, when using TLS, the function is no longer a leaf as it calls
__aeabi_read_tp. With statically linked programs this is not an issue as
it doesn't make use of the stack, however with dynamically linked
applications we enter rtld which does use the stack and makes assumptions
about it's alignment.
This is only a temporary fix until a better patch can be made and submitted
upstream.
Make PrologEpilogInserter save/restore all callee saved registers in
functions which call __builtin_unwind_init()
__builtin_unwind_init() is an undocumented gcc intrinsic which has
this effect, and is used in libgcc_eh.
Goes part of the way toward fixing PR8541.
This obsoletes the ugly hack to libgcc's unwind code from r245272, and
should also work for other arches, so revert the hack too.
Allow clang to build __clear_cache on ARM.
__clear_cache is special. It needs no signature, but is a real function in
compiler_rt or libgcc.
Patch by Andrew Turner.
This allows us to build the __clear_cache function in compiler-rt.
Emit native implementations of atomic operations on FreeBSD/armv6.
Just like on Linux, FreeBSD/armv6 assumes the system supports
ldrex/strex unconditionally. It is also used by the kernel. We can
therefore enable support for it, like we do on Linux.
While there, change one of the unit tests to explicitly test against
armv5 instead of armv7, as it actually tests whether libcalls are
emitted.
[ms-inline asm] Fix a crasher when we fail on a direct match.
The issue was that the MatchingInlineAsm and VariantID args to the
MatchInstructionImpl function weren't being set properly. Specifically, when
parsing intel syntax, the parser thought it was parsing inline assembly in the
at&t dialect; that will never be the case.
The crash was caused when the emitter tried to emit the instruction, but the
operands weren't set. When parsing inline assembly we only set the opcode, not
the operands, which is used to lookup the instruction descriptor.
rdar://13854391 and PR15945
Also, this commit reverts r176036. Now that we're correctly parsing the intel
syntax the pushad/popad don't match properly. I've reimplemented that fix using
a MnemonicAlias.
Pull in r183907 from llvm trunk:
X86: Make the cmov aliases work with intel syntax too.
These commits make a number of Intel-style inline assembly mnemonics
aliases (occurring in several ports) work properly, which could cause
assertions otherwise.
Reported by: kwm, bapt
PR15662: Optimized debug info produces out of order function
parameters
When a function is inlined we lazily construct the variables
representing the function's parameters. After that, we add any
remaining unused parameters.
If the function doesn't use all the parameters, or uses them out of
order, then the DWARF would produce them in that order, producing a
parameter order that doesn't match the source.
This fix causes us to always keep the arg variables at the start of
the variable list & in the original order from the source.
Reported by: avg
MFC after: 1 week
Add support for optimized (non-generic) atomic libcalls.
For integer types of sizes 1, 2, 4 and 8, libcompiler-rt (and libgcc)
provide atomic functions that pass parameters by value and return
results directly.
libgcc and libcompiler-rt only provide optimized libcalls for
__atomic_fetch_*, as generic libcalls on non-integer types would make
little sense. This means that we can finally make __atomic_fetch_*
work
on architectures for which we don't provide these operations as
builtins
(e.g. ARM).
This should fix the dreaded "cannot compile this atomic library call
yet" error that would pop up once every while.
This should make it possible for me to get C11 atomics working on all of
our platforms.
LoopVectorize: LoopSimplify can't canonicalize loops with an
indirectbr in it, don't assert on those cases.
Fixes PR16139.
This should fix clang assertion failures when optimizing at -O3, similar
to:
Assertion failed: (TheLoop->getLoopPreheader() && "No preheader!!"),
function canVectorize, file
contrib/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp, line 2171.
Reported by: O. Hartmann <ohartman@zedat.fu-berlin.de>
PR: ports/178332, ports/178977
MFC after: 3 days
LoopVectorize: getConsecutiveVector must respect signed arithmetic
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.
Should fix PR15882.
This should fix Firefox crashes some people have been reporting, when it
is compiled with -O3.
LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make
sure that the order in which the elements are scalarized is the same
as the original order.
This fixes a miscompilation in FreeBSD's regex library.
This should fix lib/libc/regex/regcomp.c at -O3 with clang 3.3 r178860
on CPUs with SSE. Before this change, the vectorizer could incorrectly
rearrange the second loop in computejumps(), leading to possibly invalid
entries in the re_gets::charjump table.
The net result was that for example "sed s/@CC@/foo/" failed to work
correctly, leading to trouble with many configure scripts.
upcoming 3.3 release (branching and freezing expected in a few weeks).
Preliminary release notes can be found at the usual location:
<http://llvm.org/docs/ReleaseNotes.html>
An MFC is planned once the actual 3.3 release is finished.
X86: Disable cmov-memory patterns on subtargets without cmov.
Fixes PR15115.
For the i386 arch, this should enable cmov instructions only on
-march=pentiumpro and higher. Since our default CPU is i486, cmov
instructions will now be disabled by default.
MFC after: 1 week
Refactor the x86 CPU name logic in the driver and pass -march and -mcpu
flag information down from the Clang driver into the Gold linker plugin
for LTO. This allows specifying -march on the linker commandline and
should hopefully have it pass all the way through to the LTO optimizer.
Fixes PR14697.
Pull in r175919 from upstream clang trunk:
Driver: Pass down the -march setting down to -cc1as on x86 too.
The assembler historically didn't make use of any target features, but this has
changed when support for old CPUs that don't support long nops was added.
This should fix the long nops that still occurred in crt*.o, and
possibly other object files, if the system was compiled for a CPU that
does not support those, such as Geode.
Note that gcc on i386 also does not pass through any -march, -mcpu or
-mtune setting to gas, but this has not caused any trouble yet, because
gas defaults to i386.
Reported by: lev
MFC after: 1 week
MCParser: Reject .balign with non-pow2 alignments.
GNU as rejects them and there are configure scripts in the wild that
check if the assembler rejects ".align 3" to determine whether the
alignment is in bytes or powers of two.
MFC after: 3 days
X86: Disable generation of rep;movsl when %esi is used as a base pointer.
This happens when there is both stack realignment and a dynamic alloca in the
function. If we overwrite %esi (rep;movsl uses fixed registers) we'll lose the
base pointer and the next register spill will write into oblivion.
Fixes PR15249 and unbreaks firefox on i386/freebsd. Mozilla uses dynamic allocas
and freebsd a 4 byte stack alignment.
MFC after: 1 week
Dont use/link ARCMT, StaticAnalyzer and Rewriter to clang when the user
specifies not to. Dont build ASTMatchers with Rewriter disabled and
StaticAnalyzer when it's disabled.
Without all those three, the clang binary shrinks (x86_64) from ~36MB
to ~32MB (unstripped).
To disable these clang components, and get a smaller clang binary built
and installed, set WITHOUT_CLANG_FULL in src.conf(5). During the
initial stages of buildworld, those extra components are already
disabled automatically, to save some build time.
MFC after: 1 week
Fix another SROA crasher, PR14601.
This was a silly oversight, we weren't pruning allocas which were used
by variable-length memory intrinsics from the set that could be widened
and promoted as integers. Fix that.
This should fix the following assertion failure:
Assertion failed: (CanSROA), function visitUsers, file
/usr/src/lib/clang/libllvmscalaropts/../../../contrib/llvm/lib/Transforms/Scalar/SROA.cpp,
line 2395.
Reported by: gerald
X86: fcmov doesn't handle all possible EFLAGS, fall back to a branch
for the others.
Otherwise it will try to use SSE patterns and fail horribly if sse is
disabled.
Fixes PR14035.
This should fix the following assertion failure:
Assertion failed: (Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP
register!"), function getFPReg, file
contrib/llvm/lib/Target/X86/X86FloatingPoint.cpp, line 330.
which can show up when compiling contrib/compiler-rt, using -march=i686
through -march=pentium3 (CPU's which do support fcmov, but don't support
SSE2).
MFC after: 1 week
Add a new warning -Wmissing-variable-declarations, to warn about variables
defined without a previous declaration. This is similar to
-Wmissing-prototypes, but for variables instead of functions.
Make sure always-inline functions get inlined. <rdar://problem/12423986>
Without this change, when the estimated cost for inlining a function with
an "alwaysinline" attribute was lower than the inlining threshold, the
getInlineCost function was returning that estimated cost rather than the
special InlineCost::AlwaysInlineCost value. That is fine in the normal
inlining case, but it can fail when the inliner considers the opportunity
cost of inlining into an internal or linkonce-odr function. It may decide
not to inline the always-inline function in that case. The fix here is just
to make getInlineCost always return the special value for always-inline
functions. I ran into this building clang with libc++. Tablegen failed to
link because of an always-inline function that was not inlined. I have been
unable to reduce the testcase down to a reasonable size.
This should fix the link errors that were reported when atf-run was
compiled with clang -stdlib=libc++. In this case, at -O3 optimization,
some calls to basic_ios::clear() were not inlined, even when the
function was marked __always_inline__.
Reported by: Jan Beich <jbeich@tormail.org>
MFC after: 1 week
X86: Disable long nops for all cpus prior to pentiumpro/i686.
This is the safest approach for now. If you think long nops matter a
lot for performance, compile with -march=i686 or higher. :)
MFC after: 3 days
When creating MCAsmBackend pass the CPU string as well. In X86AsmBackend
store this and use it to not emit long nops when the CPU is geode which
doesnt support them.
Fixes PR11212.
Pull in r164133 from upstream clang trunk:
Follow up on llvm r164132.
This should prevent illegal instructions when building world on Geode
CPUs (e.g. Soekris).
MFC after: 3 days