[VectorUtils] Fix nasty use-after-free
In truncateToMinimalBitwidths() we were RAUW'ing an instruction then
erasing it. However, that intruction could be cached in the map we're
iterating over. The first check is "I->use_empty()" which in most
cases would return true, as the (deleted) object was RAUW'd first so
would have zero use count. However in some cases the object could
have been polluted or written over and this wouldn't be the case.
Also it makes valgrind, asan and traditionalists who don't like their
compiler to crash sad.
No testcase as there are no externally visible symptoms apart from a
crash if the stars align.
Fixes PR26509.
This should fix crashes when building a number of ports on arm64.
Reported by: andrew
be merged to the official release_38 branch soon, but we need it ASAP):
Stop increasing alignment of externally-visible globals on ELF
platforms.
With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.
This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311
(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)
Differential Revision: http://reviews.llvm.org/D16145
bugfix-only release, with no new features.
Please note that from 3.5.0 onwards, clang and llvm require C++11
support to build; see UPDATING for more information.
[SLP] Vectorize for all-constant entries.
This should fix libc++'s iostream initialization SIGBUSing on amd64,
whenever the global cout symbol is not aligned to 16 bytes.
Some further explanation: libc++'s iostream.cpp contains the definitions
of std::cout, std::cerr and so on. These global objects are effectively
declared with an alignment of 8 bytes. When an executable is linked
against libc++.so, it can sometimes get a copy of the global object,
which is then at the same alignment.
However, with clang 3.7.0, the initialization of these global objects
will incorrectly use SSE instructions (e.g. movdqa), whenever the
optimization level is high enough, and SSE is enabled, such as on amd64.
When any of these objects is not aligned to 16 bytes, this will result
in a SIGBUS during iostream initialization. In contrast, clang 3.6.x
and earlier took the 8 byte alignment into consideration, and avoided
SSE for those particular operations.
After bisecting of upstream changes, I found that the above revision
caused the change of this behavior, so I am reverting it now as a
workaround, while a discussion and test case is being prepared for
upstream.
[SCCP] Turn loads of null into undef instead of zero initialized values
Surprisingly, this is a correctness issue: the mmx type exists for
calling convention purposes, LLVM doesn't have a zero representation for
them.
This partially fixes PR23999.
Pull in r241143 from upstream llvm trunk (by David Majnemer):
[LoopUnroll] Use undef for phis with no value live
We would create a phi node with a zero initialized operand instead of
undef in the case where no value was originally available. This was
problematic for x86_mmx which has no null value.
These fix a "Cannot create a null constant of that type!" error when
compiling the graphics/sdl2_gfx port with MMX enabled.
Reported by: amdmi3
LoopRotate: When reconstructing loop simplify form don't split edges
from indirectbrs.
Yet another chapter in the endless story. While this looks like we
leave the loop in a non-canonical state this replicates the logic in
LoopSimplify so it doesn't diverge from the canonical form in any way.
http://llvm.org/PR21968
This fixes a "Cannot split critical edge from IndirectBrInst" assertion
failure when building the devel/radare2 port.
PR: 195480, 196987
MFC after: 3 days
Revert "Added inst combine transforms for single bit tests from Chris's note"
This reverts commit r210006, it miscompiled libapr which is used in who
knows how many projects.
A test has been added to ensure that we don't regress again.
This fixes a miscompilation in libapr, which caused problems in svnlite.
all of the features in the current working draft of the upcoming C++
standard, provisionally named C++1y.
The code generator's performance is greatly increased, and the loop
auto-vectorizer is now enabled at -Os and -O2 in addition to -O3. The
PowerPC backend has made several major improvements to code generation
quality and compile time, and the X86, SPARC, ARM32, Aarch64 and SystemZ
backends have all seen major feature work.
Release notes for llvm and clang can be found here:
<http://llvm.org/releases/3.4/docs/ReleaseNotes.html>
<http://llvm.org/releases/3.4/tools/clang/docs/ReleaseNotes.html>
MFC after: 1 month
InstCombine: Check for zero shift amounts before subtracting one
causing integer overflow.
PR17026. Also avoid undefined shifts and shift amounts larger than 64
bits (those are always undef because we can't represent integer types
that large).
This should fix assertion failures when building the emulators/xmame
port.
Reported by: bapt
LoopVectorize: LoopSimplify can't canonicalize loops with an
indirectbr in it, don't assert on those cases.
Fixes PR16139.
This should fix clang assertion failures when optimizing at -O3, similar
to:
Assertion failed: (TheLoop->getLoopPreheader() && "No preheader!!"),
function canVectorize, file
contrib/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp, line 2171.
Reported by: O. Hartmann <ohartman@zedat.fu-berlin.de>
PR: ports/178332, ports/178977
MFC after: 3 days
LoopVectorize: getConsecutiveVector must respect signed arithmetic
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negatives numbers. The start index passed to
getConsecutiveVector must also be signed.
Should fix PR15882.
This should fix Firefox crashes some people have been reporting, when it
is compiled with -O3.
LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make
sure that the order in which the elements are scalarized is the same
as the original order.
This fixes a miscompilation in FreeBSD's regex library.
This should fix lib/libc/regex/regcomp.c at -O3 with clang 3.3 r178860
on CPUs with SSE. Before this change, the vectorizer could incorrectly
rearrange the second loop in computejumps(), leading to possibly invalid
entries in the re_gets::charjump table.
The net result was that for example "sed s/@CC@/foo/" failed to work
correctly, leading to trouble with many configure scripts.
upcoming 3.3 release (branching and freezing expected in a few weeks).
Preliminary release notes can be found at the usual location:
<http://llvm.org/docs/ReleaseNotes.html>
An MFC is planned once the actual 3.3 release is finished.
Fix another SROA crasher, PR14601.
This was a silly oversight, we weren't pruning allocas which were used
by variable-length memory intrinsics from the set that could be widened
and promoted as integers. Fix that.
This should fix the following assertion failure:
Assertion failed: (CanSROA), function visitUsers, file
/usr/src/lib/clang/libllvmscalaropts/../../../contrib/llvm/lib/Transforms/Scalar/SROA.cpp,
line 2395.
Reported by: gerald