[x86] Fix wrong lowering of vsetcc nodes (PR25080).
Function LowerVSETCC (in X86ISelLowering.cpp) assumed that, for non-AVX512
targets, the source type and destination type of a type-legalized setcc node
were always the same type.
This assumption is incorrect: the type legalizer is not always able to promote
the return type of a setcc to the same type as the first operand of that
setcc.
In the case of a vsetcc node, the legalizer first checks whether the first
input operand has a legal type. If so, it promotes the return type of the
vsetcc to that same type. Otherwise, the return type is promoted to the 'next
legal type', which, for vectors of MVT::i1, is always a 128-bit integer vector
type.
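The rule above can be modeled with a small, self-contained sketch. This is not
LLVM code: VecTy, isLegalAVX and setccResultType are hypothetical names, and
legality is simplified to "128-bit or 256-bit vector with 8/16/32/64-bit
elements", which only approximates an AVX target.

```cpp
// Toy model of a vector type: element count and element width in bits.
struct VecTy {
    unsigned numElts;
    unsigned eltBits;
};

// Simplified legality check (an assumption made for illustration only).
bool isLegalAVX(VecTy t) {
    unsigned total = t.numElts * t.eltBits;
    bool eltOk = t.eltBits == 8 || t.eltBits == 16 ||
                 t.eltBits == 32 || t.eltBits == 64;
    return eltOk && (total == 128 || total == 256);
}

// Result type chosen for a vsetcc whose first operand has type `op`.
VecTy setccResultType(VecTy op) {
    if (isLegalAVX(op))
        return op;                          // reuse the operand type
    return {op.numElts, 128 / op.numElts};  // promote vNi1 to a 128-bit vector
}
```

Under this model, an operand of type v8i23 yields a v8i16 result, while a
legal v8i32 operand yields a v8i32 result.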
Example (-mattr=+avx):
%0 = trunc <8 x i32> %a to <8 x i23>
%1 = icmp eq <8 x i23> %0, zeroinitializer
The initial selection dag for the code above is:
v8i1 = setcc t5, t7, seteq:ch
t5: v8i23 = truncate t2
t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1
t7: v8i32 = build_vector of all zeroes.
The type legalizer would first check whether 't5' has a legal type. If so, it
would reuse that same type to promote the return type of the setcc node.
Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used to
promote the return type of the setcc node. Consequently, the setcc return type
is promoted to v8i16. Later on, 't5' is promoted to v8i32 thus leading to the
following dag node:
v8i16 = setcc t32, t25, seteq:ch
where t32 and t25 are now values of type v8i32.
Before this patch, function LowerVSETCC would have wrongly expanded the setcc
to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an
instruction. In our case, ISel would have matched a VPCMPEQWrr:
t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25
However, t36 and t25 are both VR256, while the result is of register class
VR128. This inconsistency ended up causing the insertion of COPY instructions
like this:
%vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3
This is an invalid full copy (not a sub-register copy).
Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy
instruction" in the attempt to expand the malformed pseudo COPY instructions.
This patch fixes the problem by adding the missing logic in LowerVSETCC to
handle the corner case of a setcc with 128-bit return type and 256-bit operand
type.
This problem was originally reported by Dimitry as PR25080. It has been latent
for a very long time. I have added the minimal reproducer from that bugzilla
as test setcc-lowering.ll.
Differential Revision: http://reviews.llvm.org/D13660
This should fix the "Cannot emit physreg copy instruction" errors when
compiling contrib/wpa/src/common/ieee802_11_common.c with CPUTYPE set
to a CPU supporting AVX (e.g. sandybridge, ivybridge).
Formal release notes are available:
https://subversion.apache.org/docs/release-notes/1.9.html
Of particular note, the client checkout format has *not* changed so
upgrades should *not* be required.
When reading a repository (file:// or running as a local server), an
improved fsfs version 7 is available with significant performance
improvements. An optional upgrade is possible to use the new features.
Without the upgrade, this is fully read/write compatible with the
version 6 fsfs as in svn-1.8.
Relnotes: yes
From OpenBSD's commit log:
This was responsible for memory corruption with recent versions
of Mesa where c and c++ code share a header with a packed enum type.
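A minimal sketch of the kind of shared declaration involved, assuming the
problem is a disagreement between the C and C++ compilers on the size of a
packed enum. PackedKind, PlainKind and Shared are hypothetical names, and the
packed attribute is a GNU extension.

```cpp
#include <cstddef>

// GNU extension: request the smallest integer type that fits the enumerators.
enum __attribute__((packed)) PackedKind { KIND_A, KIND_B, KIND_C };

// Without the attribute, the enum is typically as large as an int.
enum PlainKind { PLAIN_A, PLAIN_B, PLAIN_C };

// A struct of the sort that might live in a header shared by C and C++ code.
struct Shared {
    PackedKind kind;
    unsigned char flag;
};

std::size_t packedSize() { return sizeof(PackedKind); }
std::size_t sharedSize() { return sizeof(Shared); }
```

If one compiler honors the attribute and the other ignores it, the two sides
compute different sizes and member offsets for Shared, so writes from one
side can land outside the object as seen by the other.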
Reference:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39219
Obtained from: OpenBSD (CVS rev. 1.2)
MFC after: 1 week
Enable and fix warnings during the build.
Although CMake adds warning flags, they are ignored in the libc++ headers
because the headers mark themselves with '#pragma system header'.
This patch disables the system header pragma when building libc++ and fixes
the warnings that arose.
The warnings fixed were:
1. <memory> - anonymous structs are a GNU extension.
2. <functional> - anonymous structs are a GNU extension.
3. <__hash_table> - Embedded preprocessor directives have undefined behavior.
4. <string> - Definition is missing noexcept from declaration.
5. <__std_stream> - Unused variable.
This should fix building world (in particular libatf-c++) with -std=c++11.
Reported by: Oliver Hartmann <ohartman@zedat.fu-berlin.de>
[SLP] Vectorize for all-constant entries.
This should fix libc++'s iostream initialization SIGBUSing on amd64,
whenever the global cout symbol is not aligned to 16 bytes.
Some further explanation: libc++'s iostream.cpp contains the definitions
of std::cout, std::cerr and so on. These global objects are effectively
declared with an alignment of 8 bytes. When an executable is linked
against libc++.so, it can sometimes get a copy of the global object,
which is then at the same alignment.
However, with clang 3.7.0, the initialization of these global objects
will incorrectly use SSE instructions (e.g. movdqa), whenever the
optimization level is high enough, and SSE is enabled, such as on amd64.
When any of these objects is not aligned to 16 bytes, this will result
in a SIGBUS during iostream initialization. In contrast, clang 3.6.x
and earlier took the 8 byte alignment into consideration, and avoided
SSE for those particular operations.
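The alignment condition can be checked with a short sketch. Stream, arena,
isAligned16 and placementAt are hypothetical names used only to show that an
address can satisfy an 8-byte alignment requirement without being 16-byte
aligned, which is what an aligned SSE move such as movdqa needs.

```cpp
#include <cstdint>

// True when `p` is aligned to 16 bytes.
bool isAligned16(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Stand-in for an object declared with 8-byte alignment.
struct alignas(8) Stream {
    char storage[24];
};

// Backing buffer aligned to 16 bytes so both placements can be shown.
alignas(16) unsigned char arena[64];

// Address at a given byte offset into the arena.
const void *placementAt(unsigned offset) { return arena + offset; }
```

An object placed at offset 8 is validly aligned for Stream, yet
isAligned16 reports false for it.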
After bisecting upstream changes, I found that the above revision
caused this change in behavior, so I am reverting it now as a
workaround, while a discussion and test case are being prepared for
upstream.
Highlights (not already in the FreeBSD tree):
- addr2line: Fixed multiple memory leaks related to DIE allocation
- readelf: improve sh_link validation
- various man page improvements
Sponsored by: The FreeBSD Foundation
set div/rem default values to 'expensive' in TargetTransformInfo's
cost model
...because that's what the cost model was intended to do.
As discussed in D12882, this fix has a temporary unintended
consequence for SimplifyCFG: it causes us to not speculate an fdiv.
However, two wrongs make PR24818 right, and two wrongs make PR24343
act right even though it's really still wrong.
I intend to correct SimplifyCFG and add to CodeGenPrepare to account
for this cost model change and preserve the righteousness for the bug
report cases.
https://llvm.org/bugs/show_bug.cgi?id=24818
https://llvm.org/bugs/show_bug.cgi?id=24343
Differential Revision: http://reviews.llvm.org/D12882
This fixes the too-eager fdiv hoisting in pow(), which could lead to
unexpected floating point exceptions.
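To illustrate what speculating an fdiv means, consider a division guarded by
a branch. This is a minimal sketch, not the pow() code; guardedRatio is a
hypothetical name.

```cpp
// Division guarded by a branch: the quotient is only needed when y != 0.
double guardedRatio(double x, double y) {
    if (y == 0.0)
        return 0.0;  // the source performs no division on this path
    return x / y;    // a speculating optimizer may hoist this above the branch
}
```

If the division is hoisted above the check, it also executes when y is zero
and can raise a divide-by-zero floating point exception that the source never
asked for. Treating fdiv as expensive discourages that hoisting.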
r288125, the required atomic library calls are available in compiler-rt.
The added stub for __libcpp_relaxed_store() can stay as a fallback; I
have also committed it upstream.
Add missing atomic libcall support.
Support for emitting libcalls for __atomic_fetch_nand and
__atomic_{add,sub,and,or,xor,nand}_fetch was missing; add it, and some
test cases.
Differential Revision: http://reviews.llvm.org/D10847
This fixes "cannot compile this atomic library call yet" errors when
compiling code which calls the above builtins, on arm < v6.
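For reference, a short sketch of code that uses some of these builtins. The
wrapper names fetchNand, addFetch and nandFetch are hypothetical; the
__atomic builtins themselves are the GCC/Clang ones named above. On targets
without native atomic instructions, calls like these are what get lowered to
library calls.

```cpp
// Each wrapper uses sequentially consistent ordering.

// Returns the value held before the NAND; *p becomes ~(*p & mask).
unsigned fetchNand(unsigned *p, unsigned mask) {
    return __atomic_fetch_nand(p, mask, __ATOMIC_SEQ_CST);
}

// Returns the value after the addition.
unsigned addFetch(unsigned *p, unsigned n) {
    return __atomic_add_fetch(p, n, __ATOMIC_SEQ_CST);
}

// Returns the value after the NAND.
unsigned nandFetch(unsigned *p, unsigned mask) {
    return __atomic_nand_fetch(p, mask, __ATOMIC_SEQ_CST);
}
```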
in libc++, on __ARM_ARCH < 6. Additionally, supply the missing stub
__libcpp_relaxed_store(), as proposed in http://reviews.llvm.org/D13051
NOTE: this needs to be fixed properly later on, by supplying library
functions implementing atomic operations for arm < v6. We should
probably take those from sys/arm/arm/stdatomic.c, and stuff them into
either libgcc or compiler-rt.
Some binaries (such as the FreeBSD kernel) contain a mixture of CUs
with and without debug information. Previously translate() exited upon
encountering a CU without debug information. Instead, just move on to
the next CU.
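The change in control flow can be sketched as follows. This is a toy model,
not the actual translate() code: CompileUnit and translateAll are
hypothetical names, and summing line entries stands in for the real work.

```cpp
#include <vector>

// Hypothetical stand-in for a compilation unit.
struct CompileUnit {
    bool hasDebugInfo;
    int lineEntries;
};

// Sums line entries across all CUs that carry debug information.
int translateAll(const std::vector<CompileUnit> &cus) {
    int total = 0;
    for (const CompileUnit &cu : cus) {
        if (!cu.hasDebugInfo)
            continue;  // skip this CU instead of giving up on the rest
        total += cu.lineEntries;
    }
    return total;
}
```

With an early exit in place of the continue, every CU after the first one
lacking debug information would be ignored.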
Reported by: royger
Reviewed by: royger
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D3712
both in /usr/lib and /usr/local/lib, thus simplifying the use of modules
from ports, without breaking the compat32 case again.
PR: 191151
MFC after: 3 weeks