bugfix-only release, with no new features.
Please note that from 3.5.0 onwards, clang and llvm require C++11
support to build; see UPDATING for more information.
breakpoint. The value doesn't need to be adjusted as it is already
correctly returned from the kernel.
This allows lldb to set breakpoints, and stop on them, however more work
is needed, for example single stepping fails to stop.
Discussed with: emaste
Change this to DWARF2, in the simplest way possible. (Upstream, this
was fixed in clang trunk r250173, but this was done along with a lot of
shuffling around of debug option handling, so it cannot be applied
as-is.)
Noticed by: des
MFC after: 3 days
Refactor library decision for -fopenmp support from Darwin into a
function for sharing with other platforms.
Pull in r248424 from upstream clang trunk (by Jörg Sonnenberger):
Push OpenMP linker flags after linker input on Darwin. Don't add any
libraries if -nostdlib is specified. Test.
Pull in r248426 from upstream clang trunk (by Jörg Sonnenberger):
Support linking against OpenMP runtime on NetBSD.
Pull in r250657 from upstream clang trunk (by Dimitry Andric):
Support linking against OpenMP runtime on FreeBSD.
[x86] Fix wrong lowering of vsetcc nodes (PR25080).
Function LowerVSETCC (in X86ISelLowering.cpp) worked under the wrong
assumption that for non-AVX512 targets, the source type and destination type
of a type-legalized setcc node were always the same type.
This assumption was unfortunately incorrect; the type legalizer is not always
able to promote the return type of a setcc to the same type as the first
operand of a setcc.
In the case of a vsetcc node, the legalizer firstly checks if the first input
operand has a legal type. If so, then it promotes the return type of the vsetcc
to that same type. Otherwise, the return type is promoted to the 'next legal
type', which, for vectors of MVT::i1 is always a 128-bit integer vector type.
Example (-mattr=+avx):
%0 = trunc <8 x i32> %a to <8 x i23>
%1 = icmp eq <8 x i23> %0, zeroinitializer
The initial selection dag for the code above is:
v8i1 = setcc t5, t7, seteq:ch
t5: v8i23 = truncate t2
t2: v8i32,ch = CopyFromReg t0, Register:v8i32 %vreg1
t7: v8i32 = build_vector of all zeroes.
The type legalizer would firstly check if 't5' has a legal type. If so, then it
would reuse that same type to promote the return type of the setcc node.
Unfortunately 't5' is of illegal type v8i23, and therefore it cannot be used to
promote the return type of the setcc node. Consequently, the setcc return type
is promoted to v8i16. Later on, 't5' is promoted to v8i32 thus leading to the
following dag node:
v8i16 = setcc t32, t25, seteq:ch
where t32 and t25 are now values of type v8i32.
Before this patch, function LowerVSETCC would have wrongly expanded the setcc
to a single X86ISD::PCMPEQ. Surprisingly, ISel was still able to match an
instruction. In our case, ISel would have matched a VPCMPEQWrr:
t37: v8i16 = X86ISD::VPCMPEQWrr t36, t25
However, t36 and t25 are both VR256, while the result type is instead of class
VR128. This inconsistency ended up causing the insertion of COPY instructions
like this:
%vreg7<def> = COPY %vreg3; VR128:%vreg7 VR256:%vreg3
Which is an invalid full copy (not a sub register copy).
Eventually, the backend would have hit an UNREACHABLE "Cannot emit physreg copy
instruction" in the attempt to expand the malformed pseudo COPY instructions.
This patch fixes the problem adding the missing logic in LowerVSETCC to handle
the corner case of a setcc with 128-bit return type and 256-bit operand type.
This problem was originally reported by Dimitry as PR25080. It has been latent
for a very long time. I have added the minimal reproducible from that bugzilla
as test setcc-lowering.ll.
Differential Revision: http://reviews.llvm.org/D13660
This should fix the "Cannot emit physreg copy instruction" errors when
compiling contrib/wpa/src/common/ieee802_11_common.c, and CPUTYPE is set
to a CPU supporting AVX (e.g. sandybridge, ivybridge).
[SLP] Vectorize for all-constant entries.
This should fix libc++'s iostream initialization SIGBUSing on amd64,
whenever the global cout symbol is not aligned to 16 bytes.
Some further explanation: libc++'s iostream.cpp contains the definitions
of std::cout, std::cerr and so on. These global objects are effectively
declared with an alignment of 8 bytes. When an executable is linked
against libc++.so, it can sometimes get a copy of the global object,
which is then at the same alignment.
However, with clang 3.7.0, the initialization of these global objects
will incorrectly use SSE instructions (e.g. movdqa), whenever the
optimization level is high enough, and SSE is enabled, such as on amd64.
When any of these objects is not aligned to 16 bytes, this will result
in a SIGBUS during iostream initialization. In contrast, clang 3.6.x
and earlier took the 8 byte alignment into consideration, and avoided
SSE for those particular operations.
After bisecting of upstream changes, I found that the above revision
caused the change of this behavior, so I am reverting it now as a
workaround, while a discussion and test case is being prepared for
upstream.
set div/rem default values to 'expensive' in TargetTransformInfo's
cost model
...because that's what the cost model was intended to do.
As discussed in D12882, this fix has a temporary unintended
consequence for SimplifyCFG: it causes us to not speculate an fdiv.
However, two wrongs make PR24818 right, and two wrongs make PR24343
act right even though it's really still wrong.
I intend to correct SimplifyCFG and add to CodeGenPrepare to account
for this cost model change and preserve the righteousness for the bug
report cases.
https://llvm.org/bugs/show_bug.cgi?id=24818https://llvm.org/bugs/show_bug.cgi?id=24343
Differential Revision: http://reviews.llvm.org/D12882
This fixes the too-eager fdiv hoisting in pow(), which could lead to
unexpected floating point exceptions.
Add missing atomic libcall support.
Support for emitting libcalls for __atomic_fetch_nand and
__atomic_{add,sub,and,or,xor,nand}_fetch was missing; add it, and some
test cases.
Differential Revision: http://reviews.llvm.org/D10847
This fixes "cannot compile this atomic library call yet" errors when
compiling code which calls the above builtins, on arm < v6.
Reverting r239883 and r240720:
------------------------------------------------------------------------
r239883 | echristo | 2015-06-17 00:09:32 -0700 (Wed, 17 Jun 2015) | 16 lines
Update the intel intrinsic headers to use the target attribute support.
This involved removing the conditional inclusion and replacing them
with target attributes matching the original conditional inclusion
and checks. The testcase update removes the macro checks for each
file and replaces them with usage of the __target__ attribute, e.g.:
int __attribute__((__target__(("sse3")))) foo(int a) {
_mm_mwait(0, 0);
return 4;
}
This usage does require the enclosing function have the requisite
__target__ attribute for inlining and code generation - also for
any macro intrinsic uses in the enclosing function. There's no change
for existing uses of the intrinsic headers.
------------------------------------------------------------------------
------------------------------------------------------------------------
r240720 | silvas | 2015-06-25 16:22:11 -0700 (Thu, 25 Jun 2015) | 6 lines
Remove `requires` for x86 CPU features.
Ever since the target attributes change, we don't need to guard these
headers with `requires`. Actually it's a bit worse, because if we do
then they are included textually under the covers, causing declarations
to appear in submodules they aren't supposed to be in.
------------------------------------------------------------------------
This reverts the changes to the intrinsics headers in trunk, which could
result in some ports' configure scripts misdetecting SSE (and higher)
support.
[SCCP] Turn loads of null into undef instead of zero initialized values
Surprisingly, this is a correctness issue: the mmx type exists for
calling convention purposes, LLVM doesn't have a zero representation for
them.
This partially fixes PR23999.
Pull in r241143 from upstream llvm trunk (by David Majnemer):
[LoopUnroll] Use undef for phis with no value live
We would create a phi node with a zero initialized operand instead of
undef in the case where no value was originally available. This was
problematic for x86_mmx which has no null value.
These fix a "Cannot create a null constant of that type!" error when
compiling the graphics/sdl2_gfx port with MMX enabled.
Reported by: amdmi3
Notable upstream commits (upstream revision in parens):
- Add a JSON producer to LLDB (228636)
- Don't crash on bad DWARF expression (228729)
- Add support of DWARFv3 DW_OP_form_tls_address (231342)
- Assembly profiler for MIPS64 (232619)
- Handle FreeBSD/arm64 core files (233273)
- Read/Write register for MIPS64 (233685)
- Rework LLDB system initialization (233758)
- SysV ABI for aarch64 (236098)
- MIPS software single stepping (236696)
- FreeBSD/arm live debugging support (237303)
- Assembly profiler for mips32 (237420)
- Parse function name from DWARF DW_AT_abstract_origin (238307)
- Improve LLDB prompt handling (238313)
- Add real time signals support to FreeBSDSignals (238316)
- Fix race in IOHandlerProcessSTDIO (238423)
- MIPS64 Branch instruction emulation for SW single stepping (238820)
- Improve OSType initialization in elf object file's arch_spec (239148)
- Emulation of MIPS64 floating-point branch instructions (239996)
- ABI Plugin for MIPS32 (239997)
- ABI Plugin for MIPS64 (240123)
- MIPS32 branch emulation and single stepping (240373)
- Improve instruction emulation based stack unwinding on ARM (240533)
- Add branch emulation to aarch64 instruction emulator (240769)
MC: Allow multiple comma-separated expressions on the .uleb128 directive.
For compatiblity with GNU as. Binutils documents this as
'.uleb128 expressions'. Subtle, isn't it?
Reported by: sbruno
PR: 199554
MFC after: 3 days
Fix assert instantiating string init of static variable
... when the variable's type is a typedef of a ConstantArrayType. Just
look through the typedef (and any other sugar). We only use the
constant array type here to get the element count.
This fixes an assertion failure when building the games/redeclipse port.
Reported by: amdmi3
As is described at http://llvm.org/bugs/show_bug.cgi?id=22408, the GNU
linkers ld.bfd and ld.gold currently only support a subset of the
whole range of AArch64 ELF TLS relocations. Furthermore, they assume
that some of the code sequences to access thread-local variables are
produced in a very specific sequence. When the sequence is not as the
linker expects, it can silently mis-relaxe/mis-optimize the
instructions.
Even if that wouldn't be the case, it's good to produce the exact
sequence, as that ensures that linkers can perform optimizing
relaxations.
This patch:
* implements support for 16MiB TLS area size instead of 4GiB TLS area
size. Ideally clang would grow an -mtls-size option to allow support
for both, but that's not part of this patch.
* by default doesn't produce local dynamic access patterns, as even
modern ld.bfd and ld.gold linkers do not support the associated
relocations. An option (-aarch64-elf-ldtls-generation) is added to
enable generation of local dynamic code sequence, but is off by
default.
* makes sure that the exact expected code sequence for local dynamic
and general dynamic accesses is produced, by making use of a new
pseudo instruction. The patch also removes two
(AArch64ISD::TLSDESC_BLR, AArch64ISD::TLSDESC_CALL) pre-existing
AArch64-specific pseudo SDNode instructions that are superseded by
the new one (TLSDESC_CALLSEQ).
Submitted by: Kristof Beyls
Differential Revision: https://reviews.freebsd.org/D2175
ARM: treat [N x i32] and [N x i64] as AAPCS composite types
The logic is almost there already, with our special homogeneous
aggregate handling. Tweaking it like this allows front-ends to emit
AAPCS compliant code without ever having to count registers or add
discarded padding arguments.
Only arrays of i32 and i64 are needed to model AAPCS rules, but I
decided to apply the logic to all integer arrays for more consistency.
This fixes a possible "Unexpected member type for HA" error when
compiling lib/msun/bsdsrc/b_tgamma.c for armv6.
Reported by: Jakub Palider <jpa@semihalf.com>
These are generated, and not "optimized" in any way, since I am not
entirely sure of the syntax or format of this type of file. Feel free
to suggest ways of shortening these lists.
The general idea is the same for all three files, though:
* Get rid of upstream build infrastructure (CMakeLists, Makefiles, etc)
* Delete tests, tools and utilities we don't want or use (including
samples)
* Remove various bits of upstream metadata files that we don't want or
use (.arcconfig, .gitignore, etc)
LoopRotate: When reconstructing loop simplify form don't split edges
from indirectbrs.
Yet another chapter in the endless story. While this looks like we
leave the loop in a non-canonical state this replicates the logic in
LoopSimplify so it doesn't diverge from the canonical form in any way.
http://llvm.org/PR21968
This fixes a "Cannot split critical edge from IndirectBrInst" assertion
failure when building the devel/radare2 port.
PR: 195480, 196987
MFC after: 3 days
FreeBSD core files have no section table and thus LLDB's OS and vendor
detection logic does not work. If we encounter such an ELF file, update
an unknown OS to match the host.
This is not really the correct way to handle this, but more extensive
rework of ObjectFileELF will be needed and this change restores cross-
arch core debugging until that can be completed.
There's an unfortunate layering issue between LLDB's Process/POSIX and
Process/{FreeBSD,Linux}, exposed by a refactoring in upstream revision
218568. Work around it by adding explicit #if defined(__FreeBSD__)
guards to include the correct header.
[mips] Enable arithmetic and binary operations for the i128 data type.
Summary:
This patch adds support for some operations that were missing from
128-bit integer types (add/sub/mul/sdiv/udiv... etc.). With these
changes we can support the __int128_t and __uint128_t data types
from C/C++.
Depends on D7125
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D7143
This fixes "error in backend" messages, when compiling parts of
compiler-rt using 128-bit integer types for mips64.
Reported by: sbruno
PR: 197259
[FastIsel][X86] Fix invalid register replacement for bool args
Summary:
Consider the following IR:
%3 = load i8* undef
%4 = trunc i8 %3 to i1
%5 = call %jl_value_t.0* @foo(..., i1 %4, ...)
ret %jl_value_t.0* %5
Bools (that are the result of direct truncs) are lowered as whatever
the argument to the trunc was and a "and 1", causing the part of the
MBB responsible for this argument to look something like this:
%vreg8<def,tied1> = AND8ri %vreg7<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg8,%vreg7
Later, when the load is lowered, it will insert
%vreg15<def> = MOV8rm %vreg14, 1, %noreg, 0, %noreg; mem:LD1[undef] GR8:%vreg15 GR64:%vreg14
but remember to (at the end of isel) replace vreg7 by vreg15. Now for
the bug. In fast isel lowering, we mistakenly mark vreg8 as the result
of the load instead of the trunc. This adds a fixup to have
vreg8 replaced by whatever the result of the load is as well, so
we end up with
%vreg15<def,tied1> = AND8ri %vreg15<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg15
which is an SSA violation and causes problems later down the road.
This fixes PR21557.
Test Plan: Test test case from PR21557 is added to the test suite.
Reviewers: ributzka
Reviewed By: ributzka
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6245
This fixes a possible assertion failure when compiling toolbox.cxx from
LibreOffice 4.3.5.
Reported by: kwm
[X86] Convert esp-relative movs of function arguments to pushes, step 2
This moves the transformation introduced in r223757 into a separate MI pass.
This allows it to cover many more cases (not only cases where there must be a
reserved call frame), and perform rudimentary call folding. It still doesn't
have a heuristic, so it is enabled only for optsize/minsize, with stack
alignment <= 8, where it ought to be a fairly clear win.
(Re-commit of r227728)
Differential Revision: http://reviews.llvm.org/D6789
This helps to get sys/boot/i386/boot2 below the required size again,
when optimizing with -Oz.
Allows Clang to use LLVM's fixes-x18 option
This patch allows clang to have llvm reserve the x18
platform register on AArch64. FreeBSD will use this in the kernel for
per-cpu data but has no need to reserve this register in userland so
will need this flag to reserve it.
This uses llvm r226664 to allow this register to be reserved.
Patch by Andrew Turner.
Requested by: andrew
AArch64: add backend option to reserve x18 (platform register)
AAPCS64 says that it's up to the platform to specify whether x18 is
reserved, and a first step on that way is to add a flag controlling
it.
From: Andrew Turner <andrew@fubar.geek.nz>
Requested by: andrew
triple ids
This only allows testing and does not change the defaults for mips/mips64.
They still build/use gcc by default.
Differential Revision: https://reviews.freebsd.org/D1190
Reviewed by: dim
[Aarch64] Customer lowering of CTPOP to SIMD should check for NEON
availability
This ensures llvm's AArch64 backend does not emit floating point
instructions if they are disabled.
Fix transformation of add with pc argument to adr for non-immediate
arguments.
This fixes an "Unimplemented" error when assembling certain ARM add
instructions with pc-relative arguments.
Reported by: sbruno
PR: 196412, 196423
PR20228: don't retain a pointer to a vector element after the
container has been resized.
This fixes a possible crash when compiling certain parts of libc++'s
type_traits header.
PowerPC: CTR shouldn't fire if a TLS call is in the loop
Determining the address of a TLS variable results in a function call in
certain TLS models. This means that a simple ICmpInst might actually
result in invalidating the CTR register.
In such cases, do not attempt to rely on the CTR register for loop
optimization purposes.
This fixes PR22034.
Differential Revision: http://reviews.llvm.org/D6786
This fixes a "Invalid PPC CTR loop" error when compiling parts of libc
for PowerPC-32.
[PowerPC] Replace foul hackery with real calls to __tls_get_addr
My original support for the general dynamic and local dynamic TLS
models contained some fairly obtuse hacks to generate calls to
__tls_get_addr when lowering a TargetGlobalAddress. Rather than
generating real calls, special GET_TLS_ADDR nodes were used to wrap
the calls and only reveal them at assembly time. I attempted to
provide correct parameter and return values by chaining CopyToReg and
CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not
fully correct. Problems were seen with two back-to-back stores to TLS
variables, where the call sequences ended up overlapping with unhappy
results. Additionally, since these weren't real calls, the proper
register side effects of a call were not recorded, so clobbered values
were kept live across the calls.
The proper thing to do is to lower these into calls in the first
place. This is relatively straightforward; see the changes to
PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp.
The changes here are standard call lowering, except that we need to
track the fact that these calls will require a relocation. This is
done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the
TargetGlobalAddress operand that appears earlier in the sequence.
The calls to LowerCallTo() eventually find their way to
LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(),
which calls PrepareCall(). In PrepareCall(), we detect the calls to
__tls_get_addr and immediately snag the TargetGlobalTLSAddress with
the annotated relocation information. This becomes an extra operand
on the call following the callee, which is expected for nodes of type
tlscall. We change the call opcode to CALL_TLS for this case. Back
in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only,
since we require a TOC-restore nop following the call for the 64-bit
ABIs.
During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td
convert the CALL_TLS nodes into BL_TLS nodes, and convert the
CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code
removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS
nodes can now be emitted normally using their patterns and the
associated printTLSCall print method.
Finally, as a result of these changes, all references to get-tls-addr
in its various guises are no longer used, so they have been removed.
There are existing TLS tests to verify the changes haven't messed
anything up). I've added one new test that verifies that the problem
with the original code has been fixed.
This fixes a fatal "Bad machine code" error when compiling parts of
libgomp for 32-bit PowerPC.
Add parsing of 'foo@local".
Summary:
Currently, it supports generating, but not parsing, this expression.
Test added as well.
Test Plan: New test added, no regressions due to this.
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D6672
Pull in r224494 from upstream llvm trunk (by Justin Hibbits):
Add a corresponding '@LOCAL' parse to match r224415.
Pointed out by Jim Grosbach.
[PowerPC] Add JMP_SLOT relocation definitions
This will be required by upcoming patches for LLDB support.
Patch by Justin Hibbits!
Pull in r221510 from upstream llvm trunk (by Justin Hibbits):
Add Position-independent Code model Module API.
Summary:
This makes PIC levels a Module flag attribute, which can be queried by the
backend. The flag is named `PIC Level`, and can have a value of:
0 - Backend-default
1 - Small-model (-fpic)
2 - Large-model (-fPIC)
These match the `-pic-level' command line argument for clang, and the value of the
preprocessor macro `__PIC__'.
Test Plan:
New flags tests specific for the 'PIC Level' module flag.
Tests to be added as part of a future commit for PowerPC, which will use this new API.
Reviewers: rafael, echristo
Reviewed By: rafael, echristo
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D5882
Pull in r221791 from upstream llvm trunk (by Justin Hibbits):
Add support for small-model PIC for PowerPC.
Summary:
Large-model was added first. With the addition of support for multiple PIC
models in LLVM, now add small-model PIC for 32-bit PowerPC, SysV4 ABI. This
generates more optimal code, for shared libraries with less than about 16380
data objects.
Test Plan: Test cases added or updated
Reviewers: joerg, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, mcrosier, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D5399
Together, these changes implement small-model PIC support for PowerPC.
Thanks to Justin Hibbits and Roman Divacky for their assistance in
getting this working.
Implement vaarg lowering for ppc32. Lowering of scalars and
aggregates is supported. Complex numbers are not.
This adds va_args support for PowerPC (32 bit) to clang.
Implement vaarg lowering for ppc32. Lowering of scalars and
aggregates is supported. Complex numbers are not.
This adds va_args support for PowerPC (32 bit) to clang.
Reviewed by: jhibbits
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D1308
Divacky):
Introduce CPUStringIsValid() into MCSubtargetInfo and use it for ARM
.cpu parsing.
Previously .cpu directive in ARM assembler didnt switch to the new
CPU and therefore acted as a nop. This implemented real action for
.cpu and eg. allows to assembler FreeBSD kernel with -integrated-as.
Change the name to be in style.
Add a FIXME as requested by Renato Golin.
arm asm: Let .fpu enable instructions, PR20447.
I'm not very happy with duplicating the fpu->feature mapping in ARMAsmParser.cpp
and in clang's driver. See the bug for a patch that doesn't do that, and the
review thread [1] for why this duplication exists.
1: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140811/231052.html
This makes the .fpu directive work properly, so we can successfully
assemble several .S files using the directive, under lib/libc/arm.
Allow CP10/CP11 operations on ARMv5/v6
Those registers are VFP/NEON and vector instructions should be used instead,
but old cores rely on those co-processors to enable VFP unwinding. This change
was prompted by the libc++abi's unwinding routine and is also present in many
legacy low-level bare-metal code that we ought to compile/assemble.
Fixing bug PR20025 and allowing PR20529 to proceed with a fix in libc++abi.
This enables assembling certain ARM instructions used in libgcc.
Revert "Added inst combine transforms for single bit tests from Chris's note"
This reverts commit r210006, it miscompiled libapr which is used in who
knows how many projects.
A test has been added to ensure that we don't regress again.
This fixes a miscompilation in libapr, which caused problems in svnlite.
Fix some semantic usability issues with DynamicLibrary.
This patch allows invalid DynamicLibrary instances to be
constructed, and fixes the const-correctness of the isValid()
method.
No functional change.
This is needed for supporting the upgrade to a newer LLDB snapshot.
AArch64: add support for dynamic-loader relocations
LLD needs them, and it's good to be able to print them properly when
our object dumpers encounter them.
Patch by Daniel Stewart.
This is needed for supporting the upgrade to a newer LLDB snapshot.
Re-apply previously reverted changes to restore LLDB to parity with
the last update as of upstream revision 202189. This is the first step
an LLDB update to correspond with the Clang 3.5 import and re-applies
the following upstream revisions:
SVN git
199408 3ad0a1a1
199689 05be72c3
200085 9ad47a93
Sponsored by: DARPA, AFRL
which gcc in base cannot handle. Replace these with C++98 equivalents.
While here, add the patch for the adapted fix.
Reported by: bz, kib
Pointy hat to: dim
MFC after: 1 week
X-MFC-With: r274442
Totally forget deallocated SDNodes in SDDbgInfo.
What would happen before that commit is that the SDDbgValues associated with
a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep
a map entry keyed by the SDNode pointer pointing to this list of invalidated
SDDbgNodes. As the memory gets reused, the list might get wrongly associated
with another new SDNode. As the SDDbgValues are cloned when they are transfered,
this can lead to an exponential number of SDDbgValues being produced during
DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893
Note that the previous behavior wasn't really buggy as the invalidation made
sure that the SDDbgValues won't be used. This commit can be considered a
memory optimization and as such is really hard to validate in a unit-test.
This should fix abnormally large memory usage and resulting OOM crashes
when compiling certain ports with debug information.
Reported by: Dmitry Marakasov <amdmi3@amdmi3.ru>
Upstream PRs: http://llvm.org/PR19031http://llvm.org/PR20893
MFC after: 1 week
AsmParser: Disable Darwin-style macro argument expansion on non-darwin targets.
There is code in the wild that relies on $0 not being expanded.
This fixes some cases of using $ signs in literals being incorrectly
assembled.
Reported by: Richard Henderson
Upstream PR: http://llvm.org/PR21500
MFC after: 3 days
to apply with the same patch options onto a fresh upstream llvm/clang
3.4.1 checkout, and use approximately the same header tempate for them.
MFC after: 3 days
Set trunc store action to Expand for all X86 targets.
When compiling without SSE2, isTruncStoreLegal(F64, F32) would return
Legal, whereas with SSE2 it would return Expand. And since the Target
doesn't seem to actually handle a truncstore for double -> float, it
would just output a store of a full double in the space for a float
hence overwriting other bits on the stack.
Patch by Luqman Aden!
This should fix clang -O0 on i386 assigning garbage to floats, in
certain scenarios.
PR: 187437
Submitted by: cebd@gmail.com
Obtained from: http://llvm.org/viewvc/llvm-project?rev=217410&view=rev
MFC after: 3 days
Debug info: fix a crash when emitting IndirectFieldDecls, which were
previously not handled at all.
rdar://problem/16348575
MFC after: 1 week
Sponsored by: DARPA, AFRL
Note that r271282 contains only the src change from Clang rev 200797.
This patch file includes two follow-on changes to the test case, which
do not apply to the copy in the FreeBSD tree.
Upstream Clang revisions:
200797:
Debug info: fix a crasher when when emitting debug info for
not-yet-completed templated types. getTypeSize() needs a complete type.
rdar://problem/15931354
200798:
Simplify testcase from r200797 some more.
200805:
Further simplify r200797 and add an explanatory comment.
PR: 193347
MFC after: 3 days
Sponsored by: DARPA, AFRL
Debug info: fix a crasher when when emitting debug info for
not-yet-completed templated types. getTypeSize() needs a complete type.
rdar://problem/15931354
PR: 193347
MFC after: 3 days
Sponsored by: DARPA, AFRL
Reverts 271025 but still functionally patches it. Original intent is still
the same. Pointed out by rdivacky.
MFV: Only emit movw on ARMv6T2
Building for the FreeBSD default target ARMv6 was emitting movw ASM on certain
test cases (found building qmake4/5 for ARM). Don't do that, moreover, the AS
in base doesn't understand this instruction for this target. One would need
to use --integrated-as to get this to build if desired.
http://llvm.org/viewvc/llvm-project?view=revision&revision=216989
Submitted by: ian
Reviewed by: dim
Obtained from: llvm.org
MFC after: 2 days
Relnotes: yes
Building for the FreeBSD default target ARMv6 was emitting movw ASM on certain
test cases (found building qmake4/5 for ARM). Don't do that, moreover, the AS
in base doesn't understand this instruction for this target. One would need
to use --integrated-as to get this to build if desired.
http://llvm.org/viewvc/llvm-project?view=revision&revision=216989
Submitted by: ian
Reviewed by: dim
Obtained from: llvm.org
MFC after: 2 days
Readline is no longer installed after r268461. A readline compatibility
header is provided by libedit, but readline definitions do not seem to
be used by LLDB anyhow.
Submitted by: markj, Jan Beich
[PPC64] Fix PR20071 (fctiduz generated for targets lacking that
instruction)
PR20071 identifies a problem in PowerPC's fast-isel implementation
for floating-point conversion to integer. The fctiduz instruction
was added in Power ISA 2.06 (i.e., Power7 and later). However, this
instruction is being generated regardless of which 64-bit PowerPC
target is selected.
The intent is for fast-isel to punt to DAG selection when this
instruction is not available. This patch implements that change.
For testing purposes, the existing fast-isel-conversion.ll test adds
a RUN line for -mcpu=970 and tests for the expected code generation.
Additionally, the existing test fast-isel-conversion-p5.ll was found
to be incorrectly expecting the unavailable instruction to be
generated. I've removed these test variants since we have adequate
coverage in fast-isel-conversion.ll.
This is needed to compile clang with debug+asserts on older powerpc64
and ppc970 targets.
Requested by: jhibbits
MFC after: 3 days
Legalizer: Add support for splitting insert_subvectors.
We handle this by spilling the whole thing to the stack and doing the
insertion as a store.
PR19492. This happens in real code because the vectorizer creates
v2i128 when AVX is enabled.
This fixes a "fatal error: error in backend: Do not know how to split
the result of this operator!" message encountered during compilation of
the net-p2p/libtorrent-rasterbar port.
Reported by: Evgeniy <iron@mail.ua>
MFC after: 3 days
Fix a bug in xmmintrin.h.
The last step of _mm_cvtps_pi16 should use _mm_packs_pi32, which is a function
that reads two __m64 values and packs four 32-bit values into four 16-bit
values.
<rdar://problem/16873717>
MFC after: 3 days
Implement a new -fstandalone-debug option. rdar://problem/15685848
It controls everything that -flimit-debug-info used to, plus the
vtable type optimization. The old -fno-limit-debug-info option is now an
alias to -fstandalone-debug and vice versa.
Standalone is the default on Darwin until dtrace is updated to work with
non-standalone debug info (rdar://problem/15758808).
Note: I kept the LimitedDebugInfo name in CodeGenOptions::DebugInfoKind
because NoStandaloneDebugInfo sounded even more confusing.
Debug info: Generate debug info for variadic functions.
Paired commit with LLVM.
rdar://problem/13690847
This merege includes changes to use the Clang 3.4 API (revisions
199686 and 200082) in lib/CodeGen/CGDebugInfo.cpp:
getParamType -> getArgType
getNumParams -> getNumArgs
getReturnType -> getResultType
Sponsored by: DARPA, AFRL
Debug info: Support variadic functions.
Variadic functions have an unspecified parameter tag after the last
argument. In IR this is represented as an unspecified parameter in the
subroutine type.
Paired commit with CFE r202185.
rdar://problem/13690847
This re-applies r202184 + a bugfix in DwarfDebug's argument handling.
This merge includes a change to use the LLVM 3.4 API in
lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp:
DwarfUnit -> CompileUnit
Sponsored by: DARPA, AFRL
all FreeBSD versions, not just 10.x and earlier. Apparently too many
people seem to have trouble with post-1993 formats.
Also remove the related notes about messing with kernel configuration
files from UPDATING, which are now superfluous.
Requested by: many
MFC after: 3 days
We previously sent SIGKILL to the debuggee in DoDestroy, but did not
actually detach or kill via ptrace. It seems that this somehow didn't
matter on Linux, but did on FreeBSD.
This would happen when quitting LLDB while stopped at a breakpoint, for
example. The debuggee remained stopped in ptrace (with the signal
either pending or lost). After a timeout of a second or two LLDB exits,
which caused the debuggee to resume and dump core from an unhandled
SIGTRAP.
BringProcessIntoLimbo is a poorly named wrapper for ptrace(PT_KILL)
which is the desired behaviour from DoDestroy.
http://llvm.org/pr18894
Sponsored by: DARPA, AFRL
applied to our copy of llvm/clang. These can be applied in alphabetical
order to a pristine llvm/clang 3.4 release source tree, to result in the
same version used in FreeBSD.
This is intended to clearly document all the changes until now, which
mostly consist of cherry pickings from the respective upstream trunks,
plus a number of hand-written FreeBSD-specific ones. Hopefully those
can eventually be cleaned up and sent upstream too.
MFC after: 1 week
X-MFC-With: r263313
ISel: Make VSELECT selection terminate in cases where the condition type has to
be split and the result type widened.
When the condition of a vselect has to be split it makes no sense widening the
vselect and thereby widening the condition. We end up in an endless loop of
widening (vselect result type) and splitting (condition mask type) doing this.
Instead, split both the condition and the vselect and widen the result.
I ran this over the test suite with i686 and mattr=+sse and saw no regressions.
Fixes PR18036.
With this fix the original problem case from the graphics/rawtherapee
port (posted in http://llvm.org/PR18036 ) now compiles within ~97MB RSS.
Reported by: mandree
MFC after: 1 week
Reland "Fix miscompile of MS inline assembly with stack realignment"
This re-lands commit r196876, which was reverted in r196879.
The tests have been fixed to pass on platforms with a stack alignment
larger than 4.
Update to clang side tests will land shortly.
Pull in r196986 from upstream llvm trunk (by Reid Kleckner):
Revert the backend fatal error from r196939
The combination of inline asm, stack realignment, and dynamic allocas
turns out to be too common to reject out of hand.
ASan inserts empy inline asm fragments and uses aligned allocas.
Compiling any trivial function containing a dynamic alloca with ASan is
enough to trigger the check.
XFAIL the test cases that would be miscompiled and add one that uses the
relevant functionality.
Pull in r202930 from upstream llvm trunk (by Hans Wennborg):
Check for dynamic allocas and inline asm that clobbers sp before building
selection dag (PR19012)
In X86SelectionDagInfo::EmitTargetCodeForMemcpy we check with MachineFrameInfo
to make sure that ESI isn't used as a base pointer register before we choose to
emit rep movs (which clobbers esi).
The problem is that MachineFrameInfo wouldn't know about dynamic allocas or
inline asm that clobbers the stack pointer until SelectionDAGBuilder has
encountered them.
This patch fixes the problem by checking for such things when building the
FunctionLoweringInfo.
Differential Revision: http://llvm-reviews.chandlerc.com/D2954
Together, these commits fix the problem encountered in the devel/emacs
port on the i386 architecture, where a combination of stack realignment,
alloca() and memcpy() could incidentally clobber the %esi register,
leading to segfaults in the temacs build-time utility.
See also: http://llvm.org/PR18171 and http://llvm.org/PR19012
Reported by: ashish
PR: ports/183064
MFC after: 1 week
in clang's InitHeaderSearch.cpp. This has been superseded by David
Chisnall's commit in r255321.
Moreover, if libc++ is used, the libstdc++ include directories should
not be in the search path at all. These directories are now only used
if you pass -stdlib=libstdc++.
MFC after: 3 days
X-MFC-With: r261991
was silently broken by upstream for a Windows-specific use-case.
Apparently some versions of CMake still rely on this archaic feature...
Reported by: rakuco
MFC after: 3 days
X-MFC-With: r261991
Don't produce an alias between destructors with different calling conventions.
Fixes pr19007.
(Please note that is an LLVM PR identifier, not a FreeBSD one.)
This should fix Firefox and/or libxul crashes (due to problems with
regparm/stdcall calling conventions) on i386.
Reported by: multiple users on freebsd-current
PR: bin/187103
MFC after: 1 week
Fix a crash that occurs when PWD is invalid.
MCJIT needs to be able to run in hostile environments, even when PWD
is invalid. There's no need to crash MCJIT in this case.
The obvious fix is to simply leave MCContext's CompilationDir empty
when PWD can't be determined. This way, MCJIT clients,
and other clients that link with LLVM don't need a valid working directory.
If we do want to guarantee valid CompilationDir, that should be done
only for clients of getCompilationDir(). This is as simple as checking
for an empty string.
The only current use of getCompilationDir is EmitGenDwarfInfo, which
won't conceivably run with an invalid working dir. However, in the
purely hypothetically and untestable case that this happens, the
AT_comp_dir will be omitted from the compilation_unit DIE.
This should help fix assertions occurring with ports-mgmt/tinderbox,
when it is using jails, and sometimes invalidates clang's current
working directory.
Reported by: decke
MFC after: 2 weeks
X-MFC-With: r261991
Lower FNEG just like FABS to fneg[ds] and fmov[ds], thus avoiding
expensive libcall. Also, Qp_neg is not implemented on at least
FreeBSD. This is also what gcc is doing.
Give sparcv9 the ability to set the target cpu. Change it from
accepting -march which doesnt exist on sparc gcc to -mcpu. While here
adjust a few tests to not write an unused temporary file.
Highlights include (upstream revs in parens):
- Improvements to the remote GDB protocol client
(r196610, r197579, r197857, r200072, and others)
- Bug fixes for big-endian targets
(r196808)
- Initial support for libdispatch (GCD) queues in the debuggee
(r197190)
- Add "step-avoid-libraries" setting
(r199943)
- IO subsystem improvements (including initial work on a curses gui)
(r200263)
- Support hardware watchpoints on FreeBSD
(r201706)
- Improved unwinding through hand-written assembly functions
(r201839)
- Handle DW_TAG_unspecified_parameters for variadic functions
(r202061)
- Fix Ctrl+C interrupting a running inferior process
(r202086, r202154)
- Various bug fixes for memory leaks, LLDB segfaults, the C++ demangler,
ELF core files, DWARF debug info, and others.
Sponsored by: DARPA, AFRL
Highlights include:
- Support hardware watchpoints on FreeBSD
(r201706)
- Improved unwinding through hand-written assembly functions
(r201839)
- Handle DW_TAG_unspecified_parameters for variadic functions
(r202061)
- Fix Ctrl+C interrupting a running inferior process
(r202086, r202154)
- Various bug fixes, including to the remote GDB protocol client
Sponsored by: DARPA, AFRL
Implement getDwarfEHStackPointer() and initDwarfEHRegSizeTable() for sparcv9.
Enables the libgcc-specific undocumented __builtin_dwarf_sp_column() and
__builtin_init_dwarf_reg_size_table() builtins to be compiled for
sparc64.
SPARC: Implement TRAP lowering. Matches what GCC emits.
This lets clang emit "ta 5" for trap instructions on sparc64, instead of
emitting a call to abort(), making it possible to link the kernel.