Currently LLVM is more or less set up to use ELFv2, but it still defaults to
ELFv1 in some places. This causes lld to generate broken binaries when used
with LTO.
PR: 269455
Approved by: dim
MFC after: 3 days
This updates llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and
openmp to llvmorg-15-init-17826-g1f8ae9d7e7e4, the last commit before
the upstream release/15.x branch was created.
PR: 265425
MFC after: 2 weeks
Merge commit 6710b21d4698 from llvm git (by Kai Luo):
[PowerPC] Allow llvm.ppc.cfence to accept pointer types
In the context of an atomic load, integer, pointer and floating-point types are allowed, so llvm.ppc.cfence should accept any of these types.
Fixes https://github.com/llvm/llvm-project/issues/55983.
Reviewed By: shchenz, vchuravy
Differential Revision: https://reviews.llvm.org/D127554
Requested by: jhibbits
MFC after: 3 days
Merge commit 307ace7f20d5 from llvm git (by David Sherwood):
[LoopVectorize] Ensure the VPReductionRecipe is placed after all its inputs
When vectorising ordered reductions we call a function
LoopVectorizationPlanner::adjustRecipesForReductions to replace the
existing VPWidenRecipe for the fadd instruction with a new
VPReductionRecipe. We attempt to insert the new recipe in the same
place, but this is wrong because createBlockInMask may have
generated new recipes that VPReductionRecipe now depends upon. I
have changed the insertion code to append the recipe to the
VPBasicBlock instead.
Added a new RUN with tail-folding enabled to the existing test:
Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll
Differential Revision: https://reviews.llvm.org/D129550
Reported by: yuri
PR: 264834
MFC after: 3 days
Merge llvm review D77558, by Justin Hibbits:
PowerPC: Don't hoist float multiply + add to fused operation on SPE
SPE doesn't have a fmadd instruction, so don't bother hoisting a
multiply and add sequence to this, as it'd become just a library call.
Hoisting happens too late for the CTR usability test to veto using the CTR
in a loop, and results in an assert "Invalid PPC CTR loop!".
Reported by: alfredo
Obtained from: https://reviews.llvm.org/D77558
MFC after: 3 days
Merge commit 88ce403c6aab from llvm git (by Florian Hahn):
[LV] Add new block to place recurrence splice, if needed.
In some cases, a recurrence splice instruction needs to be inserted
between two regions, for example if the regions get re-arranged during
sinking.
Fixes #56146.
PR: 264979
Reported by: Robert Clausecker <fuz@fuz.su>
MFC after: 3 days
Merge commit e8305c0b8f49 from llvm git (by Simon Pilgrim):
[X86] combineX86ShuffleChain - don't fold to truncate(concat(V1,V2)) if it was already a PACK op
Fixes #55050
PR: 264394
Reported by: VVD <vvd@unislabs.com>
MFC after: 3 days
Merge commit d9d15af7873f from llvm git (by Qiu Chaofan):
[PowerPC] Treat llvm.fmuladd intrinsic as using CTR
This fixes bug 55463, similar to D78668. This is a temporary fix since
we will switch to post-isel CTR loop determination in the future.
Reviewed By: dim, shchenz
Differential Revision: https://reviews.llvm.org/D125746
MFC after: 2 weeks
Merge commit 027c16bef4b7 from llvm git (by Nick Desaulniers):
[X86ISelLowering] permit BlockAddressSDNode "i" constraints for PIC
When building 32b x86 code as PIC, the existing handling of "i"
constraints is conservative since generally we have to go through the
GOT to find references to functions.
But generally, BlockAddresses from C code refer to the Function in the
current TU. Permit BlockAddresses to be used with the "i" constraint
for those cases.
I regressed this in
commit 4edb9983cb8c ("[SelectionDAG] treat X constrained labels as i for asm")
Fixes: https://github.com/llvm/llvm-project/issues/53868
Reviewed By: efriedma, MaskRay
Differential Revision: https://reviews.llvm.org/D119905
This updates llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and
openmp to llvmorg-14-init-18294-gdb01b123d012, the last commit before
the upstream release/14.x branch was created.
PR: 261742
MFC after: 2 weeks
In contrast to Linux, procfs on FreeBSD does not provide entries which
can be readlinked: they are just regular files, so readlink does not give
the expected outcome. On top of that, procfs is not mounted by default to
begin with.
Reviewed by: dim
Differential Revision: https://reviews.freebsd.org/D34684
After merging llvm commit b9ca73e1a8fd for PR 262608, it would fail to
compile with:
/usr/src/contrib/llvm-project/llvm/lib/IR/Operator.cpp:197:22: error: no member named 'isZero' in 'llvm::APInt'
if (!IndexedSize.isZero()) {
~~~~~~~~~~~ ^
Upstream refactored their APInt class, and isZero() was one of the newer
methods which did not yet exist in llvm 13.0.0. Fix this by using the
older but equivalent isNullValue() method instead.
Fixes: 1b3bef43e3
MFC after: 3 days
Merge commit b9ca73e1a8fd from llvm git (by Stephen Tozer):
[DebugInfo] Correctly handle arrays with 0-width elements in GEP salvaging
Fixes an issue where GEP salvaging did not properly account for GEP
instructions which stepped over array elements of width 0 (effectively a
no-op). This unnecessarily produced long expressions by appending
`... + (x * 0)` and potentially extended the number of SSA values used
in the dbg.value. This also erroneously triggered an assert in the
salvage function that the element width would be strictly positive.
These issues are resolved by simply ignoring these useless operands.
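The idea of ignoring zero-width operands can be sketched outside LLVM as
follows. This is only an illustrative model, not the code from the upstream
commit: the VariableOffset struct, the buildOffsetExpr function and the
string form of the expression are all hypothetical stand-ins for the real
GEP salvaging data structures.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of one variable GEP index: the name of the SSA value
// used as the index and the byte width of the element it steps over.
struct VariableOffset {
    std::string value;
    std::uint64_t elementWidth;
};

// Builds a textual offset expression of the form
//   base + (x * width) + ... + constant
// Terms whose element width is 0 contribute nothing to the address, so they
// are skipped instead of being appended as "+ (x * 0)".
std::string buildOffsetExpr(const std::vector<VariableOffset> &offsets,
                            std::int64_t constantOffset) {
    std::string expr = "base";
    for (const auto &o : offsets) {
        if (o.elementWidth == 0)
            continue; // no-op step: do not reference the SSA value at all
        expr += " + (" + o.value + " * " + std::to_string(o.elementWidth) + ")";
    }
    if (constantOffset != 0)
        expr += " + " + std::to_string(constantOffset);
    return expr;
}
```

With the indices {x, width 0} and {y, width 4} and a constant offset of 8,
this sketch mentions only y, so the zero-width index adds neither a term nor
an extra SSA value to the expression.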
Reviewed By: aprantl
Differential Revision: https://reviews.llvm.org/D111809
PR: 262608
Reported by: Damjan Jovanovic <damjan.jov@gmail.com>
MFC after: 3 days
Merge commit c7c84b90879f from llvm git (by Adrian Prantl):
[DwarfDebug] Refuse to emit DW_OP_LLVM_arg values wider than 64 bits
DwarfExpression::addUnsignedConstant(const APInt &Value) only supports
wider-than-64-bit values when it is used to emit a top-level DWARF
expression representing the location of a variable. Before this change,
it was possible to call addUnsignedConstant on >64 bit values within a
subexpression when substituting DW_OP_LLVM_arg values.
This can trigger an assertion failure (e.g. PR52584, PR52333) when it
happens in a fragment (DW_OP_LLVM_fragment) expression, as
addUnsignedConstant on >64 bit values splits the constant into separate
DW_OP_pieces, which modifies DwarfExpression::OffsetInBits.
This change papers over the assertion errors by bailing on overly wide
DW_OP_LLVM_arg values. A more comprehensive fix might be to split
wide values into pointer-sized fragments.
[0] https://github.com/llvm/llvm-project/blob/e71fa03/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp#L799-L805
Patch by Ricky Zhou!
Differential Revision: https://reviews.llvm.org/D115343
MFC after: 3 days
Merge commit 77e8f4eeeeed from llvm git (by David Green):
[ARM] Define ComplexPatternFuncMutatesDAG
Some of the Arm complex pattern functions call canExtractShiftFromMul,
which can modify the DAG in-place. For this to be valid and handled
successfully we need to define ComplexPatternFuncMutatesDAG.
Differential Revision: https://reviews.llvm.org/D107476
When building parts of llvm targeting armv6 on stable/12, the following
assertion can appear (or if assertions are disabled, clang is likely to
crash):
Assertion failed: (NodeToMatch->getOpcode() != ISD::DELETED_NODE && "NodeToMatch was removed partway through selection"), function SelectCodeCommon, file /usr/src/contrib/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp, line 3573.
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /usr/obj/usr/src/freebsd12-amd64/tmp/usr/bin/c++ -cc1 -triple armv6kz-unknown-freebsd12.3-gnueabihf -S --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -mrelocation-model static -mconstructor-aliases -target-cpu arm1176jzf-s -target-feature +vfp2 -target-feature +vfp2sp -target-feature -vfp3 -target-feature -vfp3d16 -target-feature -vfp3d16sp -target-feature -vfp3sp -target-feature -fp16 -target-feature -vfp4 -target-feature -vfp4d16 -target-feature -vfp4d16sp -target-feature -vfp4sp -target-feature -fp-armv8 -target-feature -fp-armv8d16 -target-feature -fp-armv8d16sp -target-feature -fp-armv8sp -target-feature -fullfp16 -target-feature +fp64 -target-feature -d32 -target-feature -neon -target-feature -sha2 -target-feature -aes -target-feature -fp16fml -target-feature +strict-align -target-abi aapcs-linux -mfloat-abi hard -fallow-half-arguments-and-returns -ffunction-sections -fdata-sections -O1 -std=c++14 -fdeprecated-macro -fno-rtti -fno-signed-char -faddrsig -fexperimental-new-pass-manager PPCISelLowering-009095.ii
1. <eof> parser at end of file
2. Code generation
3. Running pass 'Function Pass Manager' on module 'PPCISelLowering-009095.cpp'.
4. Running pass 'ARM Instruction Selection' on function '@_ZN4llvm17PPCTargetLoweringC2ERKNS_16PPCTargetMachineERKNS_12PPCSubtargetE'
This crash or assertion is fixed by the upstream commit.
MFC after: 3 days
Merge commit e5a8af7a90c6 from llvm git (by Gulfem Savrun Yeniceri):
[Passes] Fix relative lookup table converter pass
This patch fixes the relative table converter pass for lookup table
accesses that result in an instruction sequence where the gep is not
immediately followed by a load, such as when the gep is hoisted outside
the loop or another instruction is inserted between them. The fix inserts
the call to the load.relative intrinsic at the original place of the load
instead of the gep.
Issue is reported by FreeBSD via https://bugs.freebsd.org/259921.
Differential Revision: https://reviews.llvm.org/D115571
PR: 259921
Reported by: O. Hartmann <freebsd@walstatt-de.de>
MFC after: 3 days
Merge commit e27a6db5298f from llvm git (by Jameson Nash):
Bad SLPVectorization shufflevector replacement, resulting in write to wrong memory location
We see that it might otherwise do:
%10 = getelementptr {}**, <2 x {}***> %9, <2 x i32> <i32 10, i32 4>
%11 = bitcast <2 x {}***> %10 to <2 x i64*>
...
%27 = extractelement <2 x i64*> %11, i32 0
%28 = bitcast i64* %27 to <2 x i64>*
store <2 x i64> %22, <2 x i64>* %28, align 4, !tbaa !2
Which is an out-of-bounds store (the extractelement got offset 10
instead of offset 4 as intended). With the fix, we correctly generate
extractelement for i32 1 and generate correct code.
Differential Revision: https://reviews.llvm.org/D106613
Merge commit 029f1a534489 from llvm git (by Arthur Eubanks):
[LazyCallGraph] Skip blockaddresses
blockaddresses do not participate in the call graph since the only
instructions that use them must all return to someplace within the
current function. And passes cannot retrieve a function address from a
blockaddress.
This was suggested by efriedma in D58260.
Fixes PR50881.
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D112178
Merge commit f5755c0849a5 from llvm git (by Jessica Clarke):
[Mips] Add glue between CopyFromReg, CopyToReg and RDHWR nodes for TLS
The MIPS ABI requires the thread pointer be accessed via rdhwr $3, $r29.
This is currently represented by (CopyToReg $3, (RDHWR $29)) followed by
a (CopyFromReg $3). However, there is no glue between these, meaning
scheduling can break those apart. In particular, PR51691 is a report
where PseudoSELECT_I was moved to between the CopyToReg and CopyFromReg,
and since its expansion uses branches, it split the def and use of the
physical register between two basic blocks, resulting in the def being
eliminated and the use having no def. It also seems possible that a
similar situation could arise splitting up the CopyToReg from the RDHWR,
causing the RDHWR to use a destination register other than $3, violating
the ABI requirement.
Thus, add glue between all three nodes to ensure they aren't split up
during instruction selection. No regression test is added since any test
would be implicitly relying on specific scheduling behaviour, so whilst
it might be testing that glue is preventing reordering today, changes to
scheduling behaviour could result in the test no longer being able to
catch a regression here, as the reordering might no longer happen for
other unrelated reasons.
Fixes PR51691.
Reviewed By: atanasyan, dim
Differential Revision: https://reviews.llvm.org/D111967
This updates llvm, clang, compiler-rt, libc++, libunwind, lld, lldb and
openmp to llvmorg-13-init-16847-g88e66fa60ae5, the last commit before
the upstream release/13.x branch was created.
PR: 258209
MFC after: 2 weeks
Merge commit 2d8c18fbbdd1 from llvm git (by Jessica Clarke):
[X86] Don't add implicit REP prefix to VIA PadLock xstore
Commit 8fa3e8fa1492 added an implicit REP prefix to all VIA PadLock
instructions, but GNU as doesn't add one to xstore, only all the others.
This resulted in a kernel panic regression in FreeBSD upon updating to
LLVM 11 (https://bugs.freebsd.org/259218) which includes the commit in
question. This partially reverts that commit.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D112355
MFC after: 3 days
Reverts llvm commit 42eaf4fe0adef3344adfd9fbccd49f325cb549ef, identified
by bisection as the source of a regression that causes liblzma to
compress/uncompress incorrectly. It's known to affect powerpc64 BE only.
The patch unbreaks FreeBSD powerpc64 installation media, since
bsdinstall can't uncompress the *.txz produced by FreeBSD CI. The
reverted commit is probably causing other software to be miscompiled as
well.
Upstream PR: https://bugs.llvm.org/show_bug.cgi?id=51714
Reviewed by: dim
MFC after: 2 days
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D31804
Amends LLVM commit 2518433f861fcb877d0a7bdd9aec1aec1f77505a, which was
identified as the source of a regression in LLVM 12.
This affects powerpc64*, making binaries crash with a segmentation fault
due to bad code generation around "__stack_chk_guard".
The root cause and/or a proper fix is under investigation in:
https://bugs.llvm.org/show_bug.cgi?id=51590
Reviewed by: dim
MFC after: 2 days
Sponsored by: Instituto de Pesquisas Eldorado (eldorado.org.br)
Differential Revision: https://reviews.freebsd.org/D31698
Merge commit 789708617d20 from llvm git (by Koutheir Attouchi):
Do not generate calls to the 128-bit function __multi3() on 32-bit ARM
Re-applying this patch after bot failures. Should be fine now.
The function __multi3() is undefined on 32-bit ARM, so a call to it should
never be emitted. Instead, plain instructions need to be generated to
perform 128-bit multiplications.
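To give a feel for what a 128-bit multiplication built from plain 32-bit
operations looks like, here is a minimal sketch in portable C++. It is not
the instruction sequence LLVM generates; the U128 type and the mul128
function are hypothetical names chosen for this illustration, and the
schoolbook algorithm is just one straightforward way to do it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical representation: a 128-bit unsigned value stored as four
// 32-bit limbs, least significant limb first.
using U128 = std::array<std::uint32_t, 4>;

// Schoolbook multiplication, truncated to the low 128 bits of the product.
// Each step is a 32x32->64 multiply plus two 32-bit additions, which cannot
// overflow a 64-bit accumulator.
U128 mul128(const U128 &a, const U128 &b) {
    U128 r{};
    for (std::size_t i = 0; i < 4; ++i) {
        std::uint64_t carry = 0;
        // Partial products landing at limb 4 or above are dropped.
        for (std::size_t j = 0; i + j < 4; ++j) {
            std::uint64_t t = std::uint64_t(a[i]) * b[j] + r[i + j] + carry;
            r[i + j] = static_cast<std::uint32_t>(t);
            carry = t >> 32;
        }
    }
    return r;
}
```

For example, multiplying the limb form of 2^64 - 1 by itself yields the
limbs {1, 0, 0xFFFFFFFE, 0xFFFFFFFF}, that is 2^128 - 2^65 + 1, without
calling any runtime library helper.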
Differential Revision: https://reviews.llvm.org/D103906
Reported by: mmel
MFC after: 3 days