Linkers are supposed to mark PIE binaries with DF_1_PIE, such binary
cannot be correctly and usefully loaded neither by dlopen(3) nor as a
dependency of other object. For instance, we cannot do anything
useful with COPY relocations, among other things.
Glibc already added similar restriction.
Requested and reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D25086
For PIE binaries, ldd(1) performs dlopen(RTLD_TRACE) on the binary.
It is legal for binary to use initial exec TLS mode, but when such
binary (actually dso) is dlopened, we might not have enough free space
in the finalized static TLS segment. Make ldd operational by skipping
TLS space allocation, we are not going to execute any code from the
dso anyway.
Reported by: tobik
PR: 245677
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
See https://sourceware.org/bugzilla/show_bug.cgi?id=24606 for the test case.
See https://reviews.llvm.org/D64930 for the background and more discussion.
Also this fixes another bug in malloc_aligned() where total size of
the allocated memory might be not enough to fit the aligned requested
block after the initial pointer is incremented by the pointer size.
Reviewed by: bdragon
Tested by: antoine (exp-run PR 244866), bdragon, emaste
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D21163
lld 10.0 seems to generate this relocation for rdtsc_mb() ifunc in our libc.
Reported, reviewed, and tested by: dim (amd64, previous version)
Discussed with: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D23652
This allows for rtld to not issue two sigprocmask(2) syscalls for each
symbol binding operation in single-threaded processes. Rtld needs to
block signals as part of locking to ensure signal safety of the bind
process, because signal handlers might need to lazily resolve symbol
references.
As result, number of syscalls issued on startup by simple programs not
using libthr, is typically reduced 2x. For instance, for hello world,
I see:
non-sigfastblock
# (truss ./hello > /dev/null) |& wc -l
63
sigfastblock
# (truss ./hello > /dev/null) |& wc -l
37
Tested by: pho
Disscussed with: cem, emaste, jilles
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D12773
Summary:
PowerPC has two PLT models: BSS-PLT and Secure-PLT. BSS-PLT uses runtime
code generation to generate the PLT stubs. Secure-PLT was introduced with
GCC 4.1 and Binutils 2.17 (base has GCC 4.2.1 and Binutils 2.17), and is a
more secure PLT format, using a read-only linkage table, with the dynamic
linker populating a non-executable index table.
This is the libc, rtld, and kernel support only. The toolchain and build
parts will be updated separately.
Reviewed By: nwhitehorn, bdragon, pfg
Differential Revision: https://reviews.freebsd.org/D20598
MFC after: 1 month
If dso uses initial exec TLS mode, rtld tries to allocate TLS in
static space. If there is no space left, the dlopen(3) fails. If space
if allocated, initial content from PT_TLS segment is distributed to
all threads' pcbs, which was missed and caused un-initialized TLS
segment for such dso after dlopen(3).
The mode is auto-detected either due to the relocation used, or if the
DF_STATIC_TLS dynamic flag is set. In the later case, the TLS segment
is tried to allocate earlier, which increases chance of the dlopen(3)
to succeed. LLD was recently fixed to properly emit the flag, ld.bdf
did it always.
Initial test by: dumbbell
Tested by: emaste (amd64), ian (arm)
Tested by: Gerald Aryeetey <aryeeteygerald_rogers.com> (arm64)
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D19072
This allows to reuse the allocator in other environments that get
malloc(3) and related functions from libc or interposer.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential revision: https://reviews.freebsd.org/D18988
The original code did not support dynamically loaded libraries and used
suboptimal access to TLS variables.
New implementation removes lazy resolving of TLS relocation - due to flaw
in TLSDESC design is impossible to switch resolver function at runtime
without expensive locking.
Due to this, 3 specialized resolvers are implemented:
- load time resolver for TLS relocation from libraries loaded with main
executable (thus with known TLS offset).
- resolver for undefined thread weak symbols.
- slower lazy resolver for dynamically loaded libraries with fast path for
already resolved symbols.
PR: 228892, 232149, 233204, 232311
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D18417
- Do not perform ifunc relocations together with other PLT relocations
in PLT. Instead, do it during an additional pass over the init
list, so that ifuncs are resolved in the order of dso
dependencies. This allows the ifuncs resolvers to call into depended
libs. Init list now includes all objects instead of only objects
with init/fini callables.
- Disable relro protection around bind_now ifunc relocations.
I considered calling ifunc resolvers of dso after initializers of all
dependencies are processed, and decided that this is wrong/should not
be supported. The order now is normal relocations for all
objects->ifunc resolution in init order->initializers, where each step
does complete pass over all loaded objects before moving to the next
step.
Reported, tested and reviewed by: emaste
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D18400
It is unused after r340102, and more important, I do not see how to
define textsize in both practically useful and correct way, for binaries
with more that one executable segments.
Sponsored by: The FreeBSD Foundation
objects' init functions instead of doing the setup via a constructor
in libc as the init functions may already depend on these handlers
to be in place. This gets us rid of:
- the undefined order in which libc constructors as __guard_setup()
and jemalloc_constructor() are executed WRT __sparc_utrap_setup(),
- the requirement to link libc last so __sparc_utrap_setup() gets
called prior to constructors in other libraries (see r122883).
For static binaries, crt1.o still sets up the user trap handlers.
o Move misplaced prototypes for MD functions in to the MD prototype
section of rtld.h.
o Sprinkle nitems().
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
No functional change intended.
Newer binutils supports extensions to the MIPS ABI for non-PIC code
that is used when compiling O32 binaries with clang 5 (but not used
for N64 oddly enough). These extensions require support for
R_MIPS_COPY relocations as well as a second PLT GOT using
R_MIPS_JUMP_SLOT relocations.
For R_MIPS_COPY, use the same approach as on other architectures where
fixups are deferred to the MD do_copy_relocations.
The additional PLT GOT for jump slots is located in a .got.plt section
which is identified by a DT_MIPS_PLTGOT dynamic entry. This GOT also
requires fixups for the first two GOT entries just as the normal GOT.
However, the entry point for this second GOT uses a different calling
convention. Rather than passing an offset into the GOT, it passes an
offset into the .rel.plt section. This requires a second entry point
(_rtld_pltbind_start) which calls the normal _rtld_bind() rather than
_mips_rtld_bind(). This also means providing a real version of
reloc_jmpslot() which is used by _rtld_bind().
In addition, add real implementions of reloc_plt() and
reloc_jmpslots() which walk .rel.plt handling R_MIPS_JUMP_SLOT
relocations.
Reviewed by: kib
Sponsored by: DARPA / AFRL
Differential Revision: https://reviews.freebsd.org/D12326
From the manpage:
When set to a nonempty string, prevents modifications of the PLT slots
when doing bindings. As result, each call of the PLT-resolved
function is resolved. In combination with debug output, this provides
complete account of all bind actions at runtime.
Same feature exists on Linux and Solaris.
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
rtld drops the bind lock to call fini functions in an object prior to
unmapping it. The new "doomed" state flag prevents the acquisition of new
references for an object while the lock is dropped.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
Add a transient reference count to ensure that the phdr argument to the
callback remains valid while the bind lock is dropped.
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: Dell EMC Isilon
(hopefully) stock gcc 4.2.1 on i386 and other arches.
In particular:
- Do not use %ebx in the asm constraints on i386, since rtld is
compiled with -fPIC and gcc cannot handle GOT-base register reload
(clang and newer gcc can).
- Avoid direct use of [static N] construct in the function
declaration/definion. In-tree gcc was patched to support this, but
stock 4.2.1 cannot handle the feature.
Requested by: bde
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
CPUID[7].%ebx (cpu_stdext_feature), %ecx (cpu_stdext_feature2) to the
ifunc resolvers on x86.
It is much more clean to use CPUID instruction in usermode to retrieve
this information than to pass AT_HWCAP aux vector from kernel, on
x86. Still, the change does allow for use of AT_HWCAP on arches where it is
needed, by passing aux array to ifunc_init() initializer which should
prepare arguments for ifunc resolvers.
Current signature for resolvers on x86 is
func_t iresolve(uint32_t cpu_feature, uint32_t cpu_feature2,
uint32_t cpu_stdext_feature, uint32_t cpu_stdext_feature2);
where arguments have identical meaning as the kernel variables of the
same name. The ABIs allow to use resolvers with the void or shortened
list of arguments.
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D8448
segment. According to gABI spec, presence of the tag indicates that
dynamic linker must be prepared to handle relocations against any
read-only segment, not only the segment which we, somewhat arbitrary,
declared the text.
For each read-only segment, add write permission before relocs are
processed, and return to the mapping mode requested by the phdr, after
relocs are done.
Reported, tested, and reviewed by: emaste
PR: 207631
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
phdr locks locked. This allows to call rtld services from the
callback, which is only reasonable for dlopen(path, RTLD_NOLOAD) to
test existence of the library in the image, and for dlsym(). The
later might still be not quite safe, due to the lazy resolution of
filters.
To allow dropping the locks around iteration in dl_iterate_phdr(3), we
insert markers to track current position between relocks. The global
objects list is converted to tailq and all iterators skip markers,
globallist_next() and globallist_curr() helpers are added.
Reported and tested by: davide
Reviewed by: kan
Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks
as always participating in the global symbols namespace, regardless of
the way the object was brought into the process address space.
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
rtld on x86 to be hidden. This is a micro-optimization, which allows
intrinsic references inside rtld to be handled without indirection
through PLT. The visibility of rtld symbols for other objects in the
symbol namespace is controlled by a version script.
Reviewed by: kan, jilles
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
e.g. when a global variable is initialized with a pointer to ifunc.
Add symbol type check and call resolver for STT_GNU_IFUNC symbol types
when processing non-PLT relocations, but only after non-IFUNC
relocations are done. The two-phase proceessing is required since
resolvers may reference other symbols, which must be ready to use when
resolver calls are done.
Restructure reloc_non_plt() on x86 to call find_symdef() and handle
IFUNC in single place.
For non-x86 reloc_non_plt(), check for call for IFUNC relocation and
do nothing, to avoid processing relocs twice.
PR: 193048
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
first calls mmap() with the arguments PROT_NONE and MAP_ANON to reserve a
single, contiguous range of virtual addresses for the entire shared library.
Later, rtld calls mmap() with the the shared library's file descriptor
and the argument MAP_FIXED to place the text and data sections within the
reserved range. The rationale for mapping shared libraries in this way is
explained in the commit message for Revision 190885. However, this approach
does have an unintended, negative consequence. Since the first call to
mmap() specifies MAP_ANON and not the shared library's file descriptor, the
kernel has no idea what alignment the vm object backing the file prefers.
As a result, the reserved range's alignment is unlikely to be the same as
the vm object's, and so mapping with superpages becomes impossible. To
address this problem, this revision adds the argument MAP_ALIGNED_SUPER to
the first call to mmap() if the text section is larger than the smallest
superpage size.
To determine if the text section is larger than the smallest superpage
size, rtld must always fetch the page size information. As a result, the
private code for fetching the base page size in rtld's builtin malloc is
redundant. Eliminate it. Requested by: kib
Tested by: zbb (on arm)
Reviewed by: kib (an earlier version)
Discussed with: jhb
by John Marino <draco@marino.st>, with the following (edited) commit
message
Date: Sat, 24 Mar 2012 06:40:50 +0100
Subject: [PATCH 1/1] rtld: Implement DT_RUNPATH and -z nodefaultlib
DT_RUNPATH is incorrectly being considered as an alias of DT_RPATH. The
purpose of DT_RUNPATH is to have two different types of rpath: one that
can be overridden by the environment variable LD_LIBRARY_PATH and one that
can't. With the currently implementation, LD_LIBRARY_PATH will always
trump any embedded rpath or runpath tags.
Current path search order by rtld:
==================================
LD_LIBRARY_PATH
DT_RPATH / DT_RUNPATH (always the same)
ldconfig hints file (default: /var/run/ld-elf.so.hints)
/usr/lib
New path search order by rtld:
==============================
DT_RPATH of the calling object if no DT_RUNPATH
DT_RPATH of the main binary if no DT_RUNPATH and binary isn't calling obj
LD_LIBRARY_PATH
DT_RUNPATH
ldconfig hints file
/usr/lib
The new path search matches how the linux runtime loader works. The other
major added feature is support for linker flag "-z nodefaultlib". When
this flag is passed to the linker, rtld will skip all references to the
standard library search path ("/usr/lib" in this case but it could handle
more color delimited paths) except in DT_RPATH and DT_RUNPATH.
New path search order by rtld with -z nodefaultlib flag set:
============================================================
DT_RPATH of the calling object if no DT_RUNPATH
DT_RPATH of the main binary if no DT_RUNPATH and binary isn't calling obj
LD_LIBRARY_PATH
DT_RUNPATH
ldconfig hints file (skips all references to /usr/lib)
FreeBSD notes:
- we fixed some bugs which were submitted to DragonFly and merged there
as commit 1ff8a2bd3eb6e5587174c6a983303ea3a79e0002;
- we added LD_LIBRARY_PATH_RPATH environment variable to switch to
the previous behaviour of considering DT_RPATH a synonym for DT_RUNPATH;
- the FreeBSD default search path is /lib:/usr/lib and not /usr/lib.
Reviewed by: kan
MFC after: 1 month
MFC note: flip the ld_library_path_rpath default value for stable/9
hash elements, and a helper matched_symbol() which match the given hash
entry and request, performing needed type and version checks.
Based on dragonflybsd support for GNU hash by John Marino <draco marino st>
Reviewed by: kan
Tested by: bapt
MFC after: 2 weeks
for the same object. This can happen when object is a dependency of the
dlopen()ed dso. When called several times, we waste time due to unneeded
processing, and memory, because obj->vertab is allocated anew on each
iteration.
Reviewed by: kan
MFC after: 2 weeks
are assumed to not fail.
Make the xcalloc() calling conventions follow the calloc(3) calling
conventions and replace unchecked calls to calloc() with calls to
xcalloc().
Remove redundand declarations from xmalloc.c, which are already
present in rtld.h.
Reviewed by: kan
Discussed with: bde
MFC after: 2 weeks
Do not relocate twice an object which happens to be needed by loaded
binary (or dso) and some filtee opened due to symbol resolution when
relocating need objects. Record the state of the relocation
processing in Obj_Entry and short-circuit relocate_objects() if
current object already processed.
Do not call constructors for filtees loaded during the early
relocation processing before image is initialized enough to run
user-provided code. Filtees are loaded using dlopen_object(), which
normally performs relocation and initialization. If filtee is
lazy-loaded during the relocation of dso needed by the main object,
dlopen_object() runs too earlier, when most runtime services are not
yet ready.
Postpone the constructors call to the time when main binary and
depended libraries constructors are run, passing the new flag
RTLD_LO_EARLY to dlopen_object(). Symbol lookups callers inform
symlook_* functions about early stage of initialization with
SYMLOOK_EARLY. Pass flags through all functions participating in
object relocation.
Use the opportunity and fix flags argument to find_symdef() in
arch-specific reloc.c to use proper name SYMLOOK_IN_PLT instead of
true, which happen to have the same numeric value.
Reported and tested by: theraven
Reviewed by: kan
MFC after: 2 weeks
Stop using strerror(3) in rtld, which brings in msgcat and stdio.
Directly access sys_errlist array of errno messages with private
rtld_strerror() function.
Now,
$ size /libexec/ld-elf.so.1
text data bss dec hex filename
96983 2480 8744 108207 1a6af /libexec/ld-elf.so.1
Reviewed by: dim, kan
MFC after: 2 weeks
particular on ARM, do require working init arrays.
Traditional FreeBSD crt1 calls _init and _fini of the binary, instead
of allowing runtime linker to arrange the calls. This was probably
done to have the same crt code serve both statically and dynamically
linked binaries. Since ABI mandates that first is called preinit
array functions, then init, and then init array functions, the init
have to be called from rtld now.
To provide binary compatibility to old FreeBSD crt1, which calls _init
itself, rtld only calls intializers and finalizers for main binary if
binary has a note indicating that new crt was used for linking. Add
parsing of ELF notes to rtld, and cache p_osrel value since we parsed
it anyway.
The patch is inspired by init_array support for DragonflyBSD, written
by John Marino.
Reviewed by: kan
Tested by: andrew (arm, previous version), flo (sparc64, previous version)
MFC after: 3 weeks