[X86] Remove special handling for 16 bit for A asm constraints.
Our 16 bit support is assembler-only + the terrible hack that is
.code16gcc. Simply using 32 bit registers does the right thing for
the latter.
Fixes PR32681.
This fixes some cases of assembling 16 bit code (i.e. SeaBIOS) that uses
the 'A' inline asm constraint, after r316989.
MFC after: 3 days
X-MFC-With: r316989
Use correct registers for "A" inline asm constraint
Summary:
In PR32594, inline assembly using the 'A' constraint on x86_64 causes
llvm to crash with a "Cannot select" stack trace. This is because
`X86TargetLowering::getRegForInlineAsmConstraint` hardcodes that 'A'
means the EAX and EDX registers.
However, on x86_64 it means the RAX and RDX registers, and on 16-bit
x86 (ia16?) it means the old AX and DX registers.
Add new register classes in `X86RegisterInfo.td` to support these
cases, and amend the logic in `getRegForInlineAsmConstraint` to cope
with different subtargets. Also add a test case, derived from
PR32594.
Reviewers: craig.topper, qcolombet, RKSimon, ab
Reviewed By: ab
Subscribers: ab, emaste, royger, llvm-commits
Differential Revision: https://reviews.llvm.org/D31902
This should fix crashes when using the 'A' constraint on amd64, for
example as it is being used in Xen.
Reported by: royger
MFC after: 3 days
GNU libtool checks the output from invoking the linker with --version
and --help, in order to determine the linker "flavour" and the command-
ine arguments to use for various link operations (e.g. generating shared
libraries). To detect GNU ld it looks for the strings "GNU" and
"supported targets:.*elf". Since LLD is compatible with GNU ld we
include those same strings to fool libtool.
Quoting from a comment in the change:
This is somewhat ugly hack, but in reality, we had no choice other
than doing this. Considering the very long release cycle of Libtool,
it is not easy to improve it to recognize LLD as a GNU compatible
linker in a timely manner. Even if we can make it, there are still a
lot of "configure" scripts out there that are generated by old
version of Libtool. We cannot convince every software developer to
migrate to the latest version and re-generate scripts. So we have
this hack.
Upstream LLVM revisions r298532, r298568, r298591
Obtained from: LLVM
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled
loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further
estimation. Recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
Pull in r296992 from upstream llvm trunk (by Sanjoy Das):
[SCEV] Decrease the recursion threshold for CompareValueComplexity
Fixes PR32142.
r287232 accidentally increased the recursion threshold for
CompareValueComplexity from 2 to 32. This change reverses that
change by introducing a separate flag for CompareValueComplexity's
threshold.
The latter revision fixes the excessive compile times for skein_block.c.
[SCEV] limit recursion depth of CompareSCEVComplexity
Summary:
CompareSCEVComplexity goes too deep (50+ on a quite a big unrolled
loop) and runs almost infinite time.
Added cache of "equal" SCEV pairs to earlier cutoff of further
estimation. Recursion depth limit was also introduced as a parameter.
Reviewers: sanjoy
Subscribers: mzolotukhin, tstellarAMD, llvm-commits
Differential Revision: https://reviews.llvm.org/D26389
This commit is the cause of excessive compile times on skein_block.c
(and possibly other files) during kernel builds on amd64.
We never saw the problematic behavior described in this upstream commit,
so for now it is better to revert it. An upstream bug has been filed
here: https://bugs.llvm.org/show_bug.cgi?id=32142
Reported by: mjg
[ValueTracking] avoid crashing from bad assumptions (PR31809)
A program may contain llvm.assume info that disagrees with other
analysis. This may be caused by UB in the program, so we must not
crash because of that.
As noted in the code comments:
https://llvm.org/bugs/show_bug.cgi?id=31809
...we can do better, but this at least avoids the assert/crash in the
bug report.
Differential Revision: https://reviews.llvm.org/D29395
This fixes an assertion when building editors/emacs-devel.
PR: 216614
The change was made to support glibc and believed to be a no-op on
FreeBSD, but that is not the case for architectures with multiple page
sizes, such as arm64. The relro p_memsz header was rounded up to the
default maximum page size (64K). When 4K pages are in use, multiple
pages beyond the final PT_LOAD segment had their permissions changed to
read-only after application of relocations and copy relocations, which
led to a segfault in certain cases.
This reverts upstream r290986. I have started a discussion about the
upstream fix on the LLVM mailing list.
Reported by: andrew
Sponsored by: The FreeBSD Foundation
Fix use-after-free bug in AffectedValueCallbackVH::allUsesReplacedWith
When transferring affected values in the cache from an old value,
identified by the value of the current callback, to the specified new
value we might need to insert a new entry into the DenseMap which
constitutes the cache. Doing so might delete the current callback
object. Move the copying logic into a new function, a member of the
assumption cache itself, so that we don't run into UB should the
callback handle itself be removed mid-copy.
Differential Revision: https://reviews.llvm.org/D28749
This should fix crashes when building lld (as part of the llvmXY ports).
Reported by: jbeich
PR: 216117
Fix PR31644 introduced by r287138 and add a regression test.
Thanks Dimitry Andric for the report and fix!
This should restore -MP output to what it was before.
Reported by: jbeich
PR: 216043
The libgcc __register_frame and __deregister_frame functions take a
pointer to a set of FDE/CIEs, terminated by an entry where length is 0.
In Apple's libunwind implementation the pointer is taken to be to a
single FDE. I suspect this was just an Apple bug, compensated by Apple-
specific code in LLVM.
See lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp and
http://lists.llvm.org/pipermail/llvm-dev/2013-April/061737.html
for more detail.
This change is based on the LLVM RTDyldMemoryManager.cpp. It should
later be changed to be alignment-safe.
Reported by: dim
Reviewed by: dim
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D8869
Add some shortcuts in LazyValueInfo to reduce compile time of
Correlated Value Propagation.
The patch is to partially fix PR10584. Correlated Value Propagation
queries LVI to check non-null for pointer params of each callsite. If
we know the def of param is an alloca instruction, we know it is
non-null and can return early from LVI. Similarly, CVP queries LVI to
check whether pointer for each mem access is constant. If the def of
the pointer is an alloca instruction, we know it is not a constant
pointer. These shortcuts can reduce the cost of CVP significantly.
Differential Revision: https://reviews.llvm.org/D18066
This significantly reduces memory usage and compilation time when
compiling a particular C++ source file of the graphics/colmap port.
PR: 215136
MFC after: 3 days
[PowerPC] Refactor soft-float support, and enable PPC64 soft float
This change enables soft-float for PowerPC64, and also makes
soft-float disable all vector instruction sets for both 32-bit and
64-bit modes. This latter part is necessary because the PPC backend
canonicalizes many Altivec vector types to floating-point types, and
so soft-float breaks scalarization support for many operations. Both
for embedded targets and for operating-system kernels desiring
soft-float support, it seems reasonable that disabling hardware
floating-point also disables vector instructions (embedded targets
without hardware floating point support are unlikely to have Altivec,
etc. and operating system kernels desiring not to use floating-point
registers to lower syscall cost are unlikely to want to use vector
registers either). If someone needs this to work, we'll need to
change the fact that we promote many Altivec operations to act on
v4f32. To make it possible to disable Altivec when soft-float is
enabled, hardware floating-point support needs to be expressed as a
positive feature, like the others, and not a negative feature,
because target features cannot have dependencies on the disabling of
some other feature. So +soft-float has now become -hard-float.
Fixes PR26970.
Pull in r283061 from upstream clang trunk (by Hal Finkel):
[PowerPC] Enable soft-float for PPC64, and +soft-float -> -hard-float
Enable soft-float support on PPC64, as the backend now supports it.
Also, the backend now uses -hard-float instead of +soft-float, so set
the target features accordingly.
Fixes PR26970.
Reported by: Mark Millard
PR: 214433
[PPC] Set SP after loading data from stack frame, if no red zone is
present
Follow-up to r280705: Make sure that the SP is only restored after
all data is loaded from the stack frame, if there is no red zone.
This completes the fix for
https://llvm.org/bugs/show_bug.cgi?id=26519.
Differential Revision: https://reviews.llvm.org/D24466
Reported by: Mark Millard
PR: 214433
Call Frame Optimization on i386 and libunwind, by disallowing the
optimization for i386-freebsd12.
This should fix some instances of broken exception handling when frame
pointers are omitted, in particular some unittests run during the build
of editors/libreoffice.
This hack will be removed as soon as upstream has implemented a more
permanent fix for this problem.
Upstream PR: https://llvm.org/bugs/show_bug.cgi?id=30879
Reviewed by: emaste
PR: 212343
[AArch64] Don't blindly lower f16/f128 FCCMPs.
Instead, extend f16 (like we do when lowering a standalone SETCC),
and let f128 be legalized to the RT calls.
Fixes PR26803.
This fixes a fatal "Cannot select" backend error when building the
net/freerdp port for AArch64.
PR: 214380
MFC after: 3 days
[AArch64] PR28877: Don't assume we're running after legalization when
creating vcvtfp2fxs
Summary:
The DAG combine transformation that was generating the
aarch64_neon_vcvtfp2fxs node was assuming that all inputs where legal
and wasn't accounting that the input could be a v4f64 if we're trying
to do the transformation before legalization. We now bail out in this
case.
All illegal types besides v4f64 were already rejected.
Fixes https://llvm.org/bugs/show_bug.cgi?id=28877
Reviewers: jmolloy
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: https://reviews.llvm.org/D23261
This fixes several ports on AArch64.
Requested by: andrew
MFC after: 3 days
Previously most messages included a newline in the string, but a few of
them were missing. Fix these and simplify by just adding the newline in
the _LIBUNWIND_LOG macro itself.
While here correct 'libuwind' typo (missing 'n').
Upstream LLVM libunwind commits r280086 and r280103.
[x86] don't try to create a vector integer inst for an SSE1 target
(PR30512)
This bug was introduced with:
http://reviews.llvm.org/rL272511
We need to restrict the lowering to v4f32 comparisons because that's
all SSE1 can handle.
This should fix:
https://llvm.org/bugs/show_bug.cgi?id=28044
This avoids a "Do not know how to custom type legalize this operation"
error when building the multimedia/ffmpeg port on i386 with SSE enabled.
[PPC] Claim stack frame before storing into it, if no red zone is
present
Unlike PPC64, PPC32/SVRV4 does not have red zone. In the absence of
it there is no guarantee that this part of the stack will not be
modified by any interrupt. To avoid this, make sure to claim the
stack frame first before storing into it.
This fixes https://llvm.org/bugs/show_bug.cgi?id=26519.
Differential Revision: https://reviews.llvm.org/D24093
Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on
PowerPC
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the
GCC-compatible __builtin_dwarf_cfa() builtin. As pointed out in
PR26761, this is currently broken on PowerPC (and likely on ARM as
well). Currently, @llvm.eh.dwarf.cfa is lowered using:
ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET)
where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86,
FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however,
does not work for PowerPC. Because of the way that the stack layout
works, the canonical frame address is not exactly (FRAMEADDR +
FRAME_TO_ARGS_OFFSET) on PowerPC (there is a lower save-area offset
as well), so it is not just a matter of implementing
FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its semantics --
We can do that, since it is currently used only for
@llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA
construct itself (since it can be easily represented as a
fixed-offset FrameIndex)). Mips currently does this, but by using a
custom lowering for ADD that specifically recognizes the (FRAMEADDR,
FRAME_TO_ARGS_OFFSET) pattern.
This change introduces a ISD::EH_DWARF_CFA node, which by default
expands using the existing logic, but can be directly lowered by the
target. Mips is updated to use this method (which simplifies its
implementation, and I suspect makes it more robust), and updates
PowerPC to do the same.
Fixes PR26761.
Differential Revision: https://reviews.llvm.org/D24038
[PowerPC] Don't spill the frame pointer twice
When a function contains something, such as inline asm, which
explicitly clobbers the register used as the frame pointer, don't
spill it twice. If we need a frame pointer, it will be saved/restored
in the prologue/epilogue code. Explicitly spilling it again will
reuse the same spill slot used by the prologue/epilogue code, thus
clobbering the saved value. The same applies to the base-pointer or
PIC-base register.
Partially fixes PR26856. Thanks to Ulrich for his analysis and the
small inline-asm reproducer.
[PowerPC] Add support for -mlongcall
The "long call" option forces the use of the indirect calling
sequence for all calls (even those that don't really need it). GCC
provides this option; This is helpful, under certain circumstances,
for building very-large binaries, and some other specialized use
cases.
Fixes PR19098.
Pull in r280041 from upstream clang trunk (by Hal Finkel):
[PowerPC] Add support for -mlongcall
Add support for GCC's PowerPC -mlongcall option; the backend supports
the corresponding target feature as of r280040.
Fixes PR19098.
Don't reduce the width of vector mul if the target doesn't support
SSE2.
The patch is to fix PR30298, which is caused by rL272694. The
solution is to bail out if the target has no SSE2.
Differential Revision: https://reviews.llvm.org/D24288
This fixes building the multimedia/libx264 port on i386.
[AArch64] Return the correct size for TLSDESC_CALLSEQ
The branch relaxation pass is computing the wrong offsets because it assumes
TLSDESC_CALLSEQ eats up 4 bytes, when in fact it is lowered to an instruction
sequence taking up 16 bytes. This can become a problem in huge files with lots
of TLS accesses, as it may slowly move branch targets out of the range computed
by the branch relaxation pass.
Fixes PR24234 https://llvm.org/bugs/show_bug.cgi?id=24234
Differential Revision: https://reviews.llvm.org/D22870
This fixes "error in backend: fixup value out of range" when compiling
the misc/talkfilters port for AArch64.
Reported by: sbruno
PR: 201762
MFC after: 3 days
positive diagnostics from -Wvarargs about enum parameters, e.g.:
cddl/contrib/opensolaris/lib/libnvpair/libnvpair.c:388:15: error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior
[-Werror,-Wvarargs]
va_start(ap, which);
^
cddl/contrib/opensolaris/lib/libnvpair/libnvpair.c:382:66: note: parameter of type 'enum nvlist_prtctl_fmt' is declared here
nvlist_prtctl_dofmt(nvlist_prtctl_t pctl, enum nvlist_prtctl_fmt which, ...)
^
Fix for pr24346: arm asm label calculation error in sub
Some ARM instructions encode 32-bit immediates as a 8-bit integer
(0-255) and a 4-bit rotation (0-30, even) in its least significant 12
bits. The original fixup, FK_Data_4, patches the instruction by the
value bit-to-bit, regardless of the encoding. For example, assuming
the label L1 and L2 are 0x0 and 0x104 respectively, the following
instruction:
add r0, r0, #(L2 - L1) ; expects 0x104, i.e., 260
would be assembled to the following, which adds 1 to r0, instead of
260:
e2800104 add r0, r0, #4, 2 ; equivalently 1
The new fixup kind fixup_arm_mod_imm takes care of the encoding:
e2800f41 add r0, r0, #260
Patch by Ting-Yuan Huang!
This fixes label calculation for ARM assembly, and is needed to enable
ARM assembly sources for OpenSSL.
Requested by: jkim
MFC after: 3 days
[X86] AMD Bobcat CPU (btver1) doesn't support XSAVE
btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't
support XSAVE.
Differential Revision: http://reviews.llvm.org/D17682
Pull in r262782 from upstream llvm trunk (by Simon Pilgrim):
[X86] AMD Bobcat CPU (btver1) doesn't support XSAVE
btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't
support XSAVE.
Differential Revision: http://reviews.llvm.org/D17683
This ensures clang does not emit AVX instructions for CPUTYPE=btver1.
Reported by: Michel Depeige <demik+freebsd@lostwave.net>
PR: 211864
MFC after: 3 days
_Unwind_Exception is required to be double word aligned. GCC has
interpreted this to mean "use the maximum useful alignment for the
target" so follow that lead.
Obtained from: LLVM review D22543
FreeBSD uses LLVM's libunwind on FreeBSD/arm64 today (and we expect to
use it more widely in the future) and it requires the EH frame segment
in static binaries.
Reviewed by: dim
Obtained from: Clang commit r266123
MFC after: 3 days
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7250
For historical reasons Darwin/i386 has ebp and esp swapped in the
eh_frame register numbering. That is:
Darwin Other
Reg # eh_frame eh_frame DWARF
===== ======== ======== =====
4 ebp esp esp
5 esp ebp ebp
Although the UNW_X86_* constants are not supposed to be coupled to
DWARF / eh_frame numbering they are currently conflated in LLVM
libunwind, and thus we require the non-Darwin numbering.
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
This may be reworked upstream but in the interim should address the
stack usage issue reported in the PR.
PR: 206384
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
The key improvement is that it may be built without cross-unwinding
support, which significantly reduces the stack space requirement.
MFC after: 1 week
Relnotes: yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D7123
Only attempt to detect AVG if SSE2 is available
Summary:
In PR29973 Sanjay Patel reported an assertion failure when a certain
loop was optimized, for a target without SSE2 support. It turned out
this was because of the AVG pattern detection introduced in rL253952.
Prevent the assertion failure by bailing out early in
`detectAVGPattern()`, if the target does not support SSE2.
Also add a minimized test case.
Reviewers: congh, eli.friedman, spatel
Subscribers: emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D20905
This should fix assertion failures ("Requires at least SSE2!") when
building the games/0ad port with CPUTYPE=pentium3.
Reported by: madpilot