Merge the projects/clang-sparc64 branch back to head. This brings in
several updates from the llvm and clang trunks to make the sparc64
backend fully functional.
Apart from one patch to sys/sparc64/include/pcpu.h which is still under
discussion, this makes it possible to let clang fully build world and
kernel for sparc64.
Any assistance with testing this on actual sparc64 hardware is greatly
appreciated, as there will unavoidably be bugs left.
Many thanks go to Roman Divacky for his upstream work on getting the
sparc64 backend into shape.
MFC r262985:
Repair a few minor mismerges from r262261 in the clang-sparc64 project
branch. This is also to minimize differences with upstream.
This is mostly a no-op other than for ARM where it adds missing
__aeabi_mem* and __aeabi_*divmod functions. Even on ARM these will remain
unused until the rest of the ARM EABI code is merged.
compiler frame size used there so this whole thing is V8/V9-agnostic.
- Use 32-bit function alignment as GCC does when using UltraSPARC I or
higher optimizations.
- Don't waste delay slots when possible.
Unfortunately, this still doesn't make libcompiler_rt a viable replacement
for libgcc on sparc64 though as once installed instead, buildworld times
increase by nearly 60% (which isn't related to these assembler functions).
This version is similar to the code shipped with libgcc. It is based on
the code from the SPARC64 architecture manual, provided without any
restrictions.
Tested by: flo@
SPARC and MIPS CPUs don't have special instructions to count
leading/trailing zeroes. The compiler-rt library provides fallback
rountines for these. The 64-bit routines, __clzdi2 and __ctzdi2, are
implemented as simple wrappers around the compiler built-in
__builtin_clz(), assuming these will expand to either 32-bit
CPU instructions or calls to __clzsi2 and __ctzsi2.
Unfortunately, our GCC 4.2 probably thinks that because the operand is
stored in a 64-bit register, it might just be a better idea to invoke
its 64-bit equivalent, simply resulting into endless recursion. Fix this
by defining __builtin_clz and __builtin_ctz to __clzsi2 and __ctzsi2
explicitly.
This version of libcompiler_rt adds support for __mulo[sdt]i4(), which
computes a multiply and its overflow flag. There are also a lot of
cleanup fixes to headers that don't really affect us.
Updating to this revision should make it a bit easier to contribute
changes back to the LLVM developers.
It seems there have only been a small amount to the compiler-rt source
code in the mean time. I'd rather have the code in sync as much as
possible by the time we release 9.0. Changes:
- The libcompiler_rt library is now dual licensed under both the
University of Illinois "BSD-Like" license and the MIT license.
- Our local modifications for using .hidden instead of .private_extern
have been upstreamed, meaning our changes to lib/assembly.h can now be
reverted.
- A possible endless recursion in __modsi3() has been fixed.
- Support for ARM EABI has been added, but it has no effect on FreeBSD
(yet).
- The functions __udivmodsi4 and __divmodsi4 have been added.
Requested by: many, including bf@ and Pedro Giffuni
Not doing so may cause all sorts of random libraries to expose
libcompiler_rt's functions, which should of course not be done.
Discussed with: kan, kib