Four bytes of padding are needed in the regular powerpc case to bring the
stack frame size up to a multiple of 16 bytes to meet ABI requirements.
Fixes odd hangs I was encountering during testing.
Summary:
We need to save off the full 64-bit register, not just the low 32 bits,
of all registers getting saved off in _rtld_bind_start. Additionally,
we need to save off the other SPE registers (SPEFSCR and accumulator),
so that their program state is not affected by the PLT resolver.
Reviewed by: bdragon
Differential Revision: https://reviews.freebsd.org/D22520
Alter bsd.compat.mk to set MACHINE and MACHINE_ARCH when included
directly so MD paths in Makefiles work. In the process centralize
setting them in LIBCOMPATWMAKEENV.
Alter .PATH and CFLAGS settings in work when the Makefile is included.
While here only support LIB32 on supported platforms rather than always
enabling it and requiring users of MK_LIB32 to filter based
TARGET/MACHINE_ARCH.
The net effect of this change is to make Makefile.libcompat only build
compatability libraries.
Changes relative to r354449:
Correct detection of the compiler type when bsd.compat.mk is used
outside Makefile.libcompat. Previously it always matched the clang
case.
Set LDFLAGS including the linker emulation for mips where -m32 seems to
be insufficent.
Reviewed by: imp, kib (origional version in r354449)
Obtained from: CheriBSD (conceptually)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22251
Alter bsd.compat.mk to set MACHINE and MACHINE_ARCH when included
directly so MD paths in Makefiles work. In the process centralize
setting them in LIBCOMPATWMAKEENV.
Alter .PATH and CFLAGS settings in work when the Makefile is included.
While here only support LIB32 on supported platforms rather than always
enabling it and requiring users of MK_LIB32 to filter based
TARGET/MACHINE_ARCH.
The net effect of this change is to make Makefile.libcompat only build
compatability libraries.
Reviewed by: imp, kib
Obtained from: CheriBSD (conceptually)
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D22251
After the aux vector is moved, it is necessary to re-digest aux_info so the
pointers are updated to the new locations.
This was causing thread creation to fail on powerpc64 when using direct
execution due to a nonsense value being read for aux_info[AT_STACKPROT].
Approved by: jhibbits (mentor)
Differential Revision: https://reviews.freebsd.org/D21656
In the past, this allocator seems to have allocated things larger than
a page seperately. Much of this code was removed at some point (perhaps
along with sbrk() used) so remove the rest. Instead, keep allocating in
power-of-two bins up to FIRST_BUCKET_SIZE << (NBUCKETS - 1). If we want
something more efficent, we should use a fancier allocator.
While here, remove some vestages of sbrk() use. Most importantly, don't
try to page align the pagepool since it's always page aligned by mmap().
Reviewed by: kib
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D21453
In Seventh Edition UNIX, the last pointer passed to free() was
guaranteed to not actually have been freed allowing memory to be
"compacted" via the following pattern:
free(foo);
foo = realloc(foo, newsize);
Further, Andrew Koenig reports in "C Traps and Pitfalls" that the
original realloc() implementation required this pattern.
The C standard is clear that this is Undefined Behavior. Modern
allocators don't support it and no portable code could rely on it so
remove this support.
Note: the removed implementation contains an off-by-one error and if
an item isn't found on the freelist, then twice as much memory as the
largest possible allocation will be copied.
Reviewed by: kib, imp
Obtained from: CheriBSD
Sponsored by: DARPA, AFRL
Differential Revision: https://reviews.freebsd.org/D21296
Instead of restoring the saved values of argc, argv and envp,
these must be loaded from the stack that _rtld() modifies.
This fixes rtld direct exec mode.
E.g.: /libexec/ld-elf.so.1 /bin/ls
Reviewed by: jhibbits
Differential Revision: https://reviews.freebsd.org/D21131
First, amd64 version of the script cannot work at least due to the
wrong architecture specification. Second, kernel can activate shared
objects for long time, due to PIE support.
It seems the intent was to allow ld-elf.so.1 to be build and used as
an executable. Since we have direct exec mode implemented for dso
ld-elf.so.1, the non-functional and commented out scripts can be
finally removed.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
I found this on one of the CheriBSD Jenkins builders. Using
beforelinking instead of ${PROG} should fix the dependency for the
DEBUG_FILES case.
Reviewed by: brooks
Currently RTLD is linked against libc_nossp_pic which means that any libc
symbol used in rtld can pull in a lot of depedencies. This was causing
symbol such as __libc_interposing and all the pthread stubs to be included
in RTLD even though they are not required. It turns out most of these
dependencies can easily be avoided by providing overrides inside of rtld.
This change is motivated by CHERI, where we have an experimental ABI that
requires additional relocation processing to allow the use of function
pointers inside of rtld. Instead of adding this self-relocation code to
RTLD I attempted to remove most function pointers from RTLD and discovered
that most of them came from the libc dependencies instead of being actually
used inside rtld.
A nice side-effect of this change is that rtld is now 22% smaller on amd64.
text data bss dec hex filename
0x21eb6 0xce0 0xe60 145910 239f6 /home/alr48/ld-elf-x86.before.so.1
0x1a6ed 0x728 0xdd8 113645 1bbed /home/alr48/ld-elf-x86.after.so.1
The number of R_X86_64_RELATIVE relocations that need to be processed on
startup has also gone down from 368 to 187 (almost 50% less).
Reviewed By: kib
Differential Revision: https://reviews.freebsd.org/D20663
Summary:
PowerPC has two PLT models: BSS-PLT and Secure-PLT. BSS-PLT uses runtime
code generation to generate the PLT stubs. Secure-PLT was introduced with
GCC 4.1 and Binutils 2.17 (base has GCC 4.2.1 and Binutils 2.17), and is a
more secure PLT format, using a read-only linkage table, with the dynamic
linker populating a non-executable index table.
This is the libc, rtld, and kernel support only. The toolchain and build
parts will be updated separately.
Reviewed By: nwhitehorn, bdragon, pfg
Differential Revision: https://reviews.freebsd.org/D20598
MFC after: 1 month
Use roundup2() and rounddown2() instead of inlining them.
Get rid of the fd local variable, use literal -1 for the mmap argument.
Use MAP_FAILED as mmap(2) failure indicator.
After that, apply some style.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
kern_execve() locks text vnode exclusive to be able to set and clear
VV_TEXT flag. VV_TEXT is mutually exclusive with the v_writecount > 0
condition.
The change removes VV_TEXT, replacing it with the condition
v_writecount <= -1, and puts v_writecount under the vnode interlock.
Each text reference decrements v_writecount. To clear the text
reference when the segment is unmapped, it is recorded in the
vm_map_entry backed by the text file as MAP_ENTRY_VN_TEXT flag, and
v_writecount is incremented on the map entry removal
The operations like VOP_ADD_WRITECOUNT() and VOP_SET_TEXT() check that
v_writecount does not contradict the desired change. vn_writecheck()
is now racy and its use was eliminated everywhere except access.
Atomic check for writeability and increment of v_writecount is
performed by the VOP. vn_truncate() now increments v_writecount
around VOP_SETATTR() call, lack of which is arguably a bug on its own.
nullfs bypasses v_writecount to the lower vnode always, so nullfs
vnode has its own v_writecount correct, and lower vnode gets all
references, since object->handle is always lower vnode.
On the text vnode' vm object dealloc, the v_writecount value is reset
to zero, and deadfs vop_unset_text short-circuit the operation.
Reclamation of lowervp always reclaims all nullfs vnodes referencing
lowervp first, so no stray references are left.
Reviewed by: markj, trasz
Tested by: mjg, pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D19923
- Remove dead and most likely rotten MALLOC_DEBUG, MSTAT, and RCHECK options.
- Remove unused headers.
- Remove one case of undefined behavior where left shift could overflow.
It is impossible on practice for rtld and libthr consumer.
PR: 237577
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Since inits for the main binary are run from rtld (for some time), the
rtld_exit atexit(3) handler, which is passed from rtld to the program
entry and installed by csu, is installed after any atexit(3) handlers
installed by main binary constructors. This means that rtld_exit() is
fired before main binary handlers.
Typical C++ static constructors are executed from init (either binary
or libs) but use atexit(3) to ensure that destructors are called in
the right order, independent of the linking order. Also, C++
libraries finalizers call __cxa_finalize(3) to flush library'
atexit(3) entries. Since atexit(3) entry is cleared after being run,
this would be mostly innocent, except that, atexit(rtld_exit) done
after main binary constructors, makes destructors from libraries
executed before destructors for main.
Fix by reordering atexit(rtld_exit) before inits for main binary, same
as it happened when inits were called by csu. Do it using new private
libc symbol with pre-defined ABI.
Reported. tested, and reviewed by: kan
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
This causes some increase of the dynamic linker size, but benefits of
avoiding compiling private copy or the linker when debugging is
required. definitely worth it.
The dbg() calls can be compiled out by defining LD_NO_DEBUG symbol.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
If dso uses initial exec TLS mode, rtld tries to allocate TLS in
static space. If there is no space left, the dlopen(3) fails. If space
if allocated, initial content from PT_TLS segment is distributed to
all threads' pcbs, which was missed and caused un-initialized TLS
segment for such dso after dlopen(3).
The mode is auto-detected either due to the relocation used, or if the
DF_STATIC_TLS dynamic flag is set. In the later case, the TLS segment
is tried to allocate earlier, which increases chance of the dlopen(3)
to succeed. LLD was recently fixed to properly emit the flag, ld.bdf
did it always.
Initial test by: dumbbell
Tested by: emaste (amd64), ian (arm)
Tested by: Gerald Aryeetey <aryeeteygerald_rogers.com> (arm64)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D19072
allocate_tls_offset returns true on success. The same issue existed
on arm and was fixed in r345693.
PR: 236880
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
allocate_tls_offset returns true on success. This still needs more
testing and review, but this change is consistent with other archs.
PR: 236880
Reported by: Andrew Gierth <andrew@tao11.riddles.org.uk>
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
r345620 by kib@ fixed the rtld issue that caused a crash at startup
during resolution of libc's ifuncs with BIND_NOW.
PR: 233333
Sponsored by: The FreeBSD Foundation
Building binaries as PIE allows the executable itself to be loaded at a
random address when ASLR is enabled (not just its shared libraries).
With this change PIE objects have a .pieo extension and INTERNALLIB
libraries libXXX_pie.a.
MK_PIE is disabled for some kerberos5 tools, Clang, and Subversion, as
they explicitly reference .a libraries in their Makefiles. These can
be addressed on an individual basis later. MK_PIE is also disabled for
rtld-elf because it is already position-independent using bespoke
Makefile rules.
Currently only dynamically linked binaries will be built as PIE.
Discussed with: dim
Reviewed by: kib
MFC after: 1 month
Relnotes: Yes
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D18423
This allows to reuse the allocator in other environments that get
malloc(3) and related functions from libc or interposer.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18988
removed as part of r341441.
This call to reloc_non_plt() may crash if ifunc resolvers use the
needed libraries symbols since the pass over the needed libs
relocation is not yet done. The change in r341441 ensures the right
relocation order otherwise.
Submitted by: theraven
MFC after: 1 week
Discussed in: https://reviews.freebsd.org/D17529
Summary: reloc_jmpslot function parameter 'defobj' is not used when using ELFv2
ABI
Submitted by: alfredo.junior_eldorado.org.br
Reviewed By: kib, git_bdragon.rtk0.net, emaste, jhibbits
Differential Revision: https://reviews.freebsd.org/D18808
We need to subtract the TLS_TCB_SIZE to get to the real data pointer, since
r13 points to the end of the TCB structure. Prior to this, devel/protobuf-c
port broke with recent update to devel/protobuf, which exposed this issue.
Submitted by: andreast
Reported by: Piotr Kubaj
MFC after: 1 week
The original code did not support dynamically loaded libraries and used
suboptimal access to TLS variables.
New implementation removes lazy resolving of TLS relocation - due to flaw
in TLSDESC design is impossible to switch resolver function at runtime
without expensive locking.
Due to this, 3 specialized resolvers are implemented:
- load time resolver for TLS relocation from libraries loaded with main
executable (thus with known TLS offset).
- resolver for undefined thread weak symbols.
- slower lazy resolver for dynamically loaded libraries with fast path for
already resolved symbols.
PR: 228892, 232149, 233204, 232311
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18417
Although these are slightly obsolete in favor of R_AARCH64_TLSDESC,
gcc -mtls-dialect=trad still use them.
Please note that definition of TLS_DTPMOD64 and TLS_DTPREL64 are incorrectly
exchanged in GNU binutils. TLS_DTPREL64 should be encoded to 1028 (as is
defined in ARM ELF ABI) but binutils encode it to 1029. And vice versa,
TLS_DTPMOD64 should be encoded to 1029 but binutils encode it to 1028.
While I'm in, add also R_AARCH64_NONE. It can be produced as result of linker
relaxation.
MFC after: 1 week
- don't relocate jump slots multiple times (if LD_BIND_NOW is defined).
- process only R_AARCH64_JUMP_SLOT here, other relocation types are handled
by reloc_plt().
MFC after: 1 week
- Do not perform ifunc relocations together with other PLT relocations
in PLT. Instead, do it during an additional pass over the init
list, so that ifuncs are resolved in the order of dso
dependencies. This allows the ifuncs resolvers to call into depended
libs. Init list now includes all objects instead of only objects
with init/fini callables.
- Disable relro protection around bind_now ifunc relocations.
I considered calling ifunc resolvers of dso after initializers of all
dependencies are processed, and decided that this is wrong/should not
be supported. The order now is normal relocations for all
objects->ifunc resolution in init order->initializers, where each step
does complete pass over all loaded objects before moving to the next
step.
Reported, tested and reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18400
bzero(3) for rtld.
This again reduces rtld dependency on libc, and in future, avoid ifunc
relocations when the functions are converted to ifuncs in libc.
Reported by: mjg
Reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18400
An issue remains with BIND_NOW and processes using threads. For now,
restore libc's BIND_NOW disable, and also disable BIND_NOW in rtld and
libthr.
A patch is in review (D18400) that likely fixes this issue, but just
disable BIND_NOW pending further testing after it is committed.
PR: 233333
Sponsored by: The FreeBSD Foundation
The function reloc_non_plt has complicated variable lifetimes that GCC 6.4.0
(the version currently used by amd64-xtoolchain-gcc) misunderstands and
produces an erroneous warning about. Silence it to allow the -Werror build
to proceed.
Reviewed by: emaste