lock against themselves, causing infinite spinning. Brian Feldman
found this problem when testing with Mozilla and supplied the fix,
which I have revised slightly.
Here is the failure scenario. A thread calls dlopen() and acquires
the writer lock. While the thread still holds the lock, a signal
is delivered and caught. The signal handler tries to call a function
which hasn't been bound yet. It thus enters the dynamic linker
and tries to acquire the reader lock. Since the writer lock is
already held, it will spin forever in the signal handler. The
thread holding the lock won't be able to progress and release the
lock.
The solution is to block almost all signals while holding the
exclusive lock.
A similar problem could conceivably occur in the opposite order.
Namely, a thread is holding the reader lock and then a signal
handler calls dlopen() or dlclose() and spins waiting for the writer
lock. We deal with this administratively by proclaiming that signal
handlers aren't allowed to call dlopen() or dlclose(). Actually
we don't have to proclaim a thing, since signal handlers aren't
allowed to call any system functions except those which are explicitly
permitted.
Submitted by: Brian Fundakowski Feldman <green>
and for all (I hope). Packages such as wine, JDK, and linuxthreads
should no longer have any problems with re-entering the dynamic
linker.
This commit replaces the locking used in the dynamic linker with a
new spinlock-based reader/writer lock implementation. Brian
Fundakowski Feldman <green> argued for this from the very beginning,
but it took me a long time to come around to his point of view.
Spinlocks are the only kinds of locks that work with all thread
packages. But on uniprocessor systems they can be inefficient,
because while a contender for the lock is spinning the holder of the
lock cannot make any progress toward releasing it. To alleviate
this disadvantage I have borrowed a trick from Sleepycat's Berkeley
DB implementation. When spinning for a lock, the requester does a
nanosleep() call for 1 usec. each time around the loop. This will
generally yield the CPU to other threads, allowing the lock holder
to finish its business and release the lock. I chose 1 usec. as the
minimum sleep which would with reasonable certainty not be rounded
down to 0.
The formerly machine-independent file "lockdflt.c" has been moved
into the architecture-specific subdirectories by repository copy.
It now contains the machine-dependent spinlocking code. For the
spinlocks I used the very nifty "simple, non-scalable reader-preference
lock" which I found at
<http://www.cs.rochester.edu/u/scott/synchronization/pseudocode/rw.html>
on all CPUs except the 80386 (the specific CPU model, not the
architecture). The 80386 CPU doesn't support the necessary "cmpxchg"
instruction, so on that CPU a simple exclusive test-and-set lock
is used instead. 80386 CPUs are detected at initialization time by
trying to execute "cmpxchg" and catching the resulting SIGILL
signal.
To reduce contention for the locks, I have revamped a couple of
key data structures, permitting all common operations to be done
under non-exclusive (reader) locking. The only operations that
require exclusive locking now are the rare intrusive operations
such as dlopen() and dlclose().
The dllockinit() interface is now deprecated. It still exists,
but only as a do-nothing stub. I plan to remove it as soon as is
reasonably possible. (From the very beginning it was clearly
labeled as experimental and subject to change.) As far as I know,
only the linuxthreads port uses dllockinit(). This interface turned
out to have several problems. As one example, when the dynamic
linker called a client-supplied locking function, that function
sometimes needed lazy binding, causing re-entry into the dynamic
linker and a big looping mess. And in any case, it turned out to be
too burdensome to require threads packages to register themselves
with the dynamic linker.
"ld-elf.so.1.old". The dynamic linker is a critical component of
the system, and it is difficult to recover if it is damaged and
there isn't a working backup available. For instance, parts of
the toolchain such as the assembler are dynamically linked, making
it impossible to build a new dynamic linker if the installed one
doesn't work.
DWARF2 exception tables emitted by the compiler for C++ sources.
These tables are tightly packed, and they contain some relocated
addresses which are not well-aligned.
figure out which shared object(s) contain the the locking methods
and fully bind those objects as if they had been loaded with
LD_BIND_NOW=1. The goal is to keep the locking methods from
requiring any lazy binding. Otherwise infinite recursion occurs
in _rtld_bind.
This fixes the infinite recursion problem in the linuxthreads port.
just a few of them. This looks like it solves the recent
ld-elf.so.1: assert failed: /usr/src/libexec/rtld-elf/lockdflt.c:55
failures seen by some applications such as JDK.
init and fini functions. Now the code is very careful to hold no
locks when calling these functions. Thus the dynamic linker cannot
be re-entered with a lock already held.
Remove the tolerance for recursive locking that I added in revision
1.2 of dllockinit.c. Recursive locking shouldn't happen any more.
Mozilla and JDK users: I'd appreciate confirmation that things still
work right (or at least the same) with these changes.
locking functions. If an application loads a shared object with
dlopen() and the shared object has an init function which requires
lazy binding, then _rtld_bind is called when the thread is already
inside the dynamic linker. This leads to a recursive acquisition
of the lock, which I was not expecting -- hence the assert failure.
This work-around makes the default locking functions handle recursive
locking. It is NOT the correct fix -- that should be implemented
at the generic locking level rather than in the default locking
functions. I will implement the correct fix in a future commit.
Since the dllockinit() interface will likely need to change, warn
about that in both the man page and the header file.
functions to be used by the dynamic linker. This can be called by
threads packages at start-up time. I will add the call to libc_r
soon.
Also add a default locking method that is used up until dllockinit()
is called. The default method works by blocking SIGVTALRM, SIGPROF,
and SIGALRM in critical sections. It is based on the observation
that most user-space threads packages implement thread preemption
with one of these signals (usually SIGVTALRM).
The dynamic linker has never been reentrant, but it became less
reentrant in revision 1.34 of "src/libexec/rtld-elf/rtld.c".
Starting with that revision, multiple threads each doing lazy
binding could interfere with each other. The usual symptom was
that a symbol was falsely reported as undefined at start-up time.
It was rare but not unseen. This commit fixes it.
libjava peeks into the dynamic linker's private Obj_Entry structures.
My recent changes introduced some new members near the front of
the structures, causing libjava to get the wrong fields. This commit
moves the new members toward the end of the structure so that the
layout of the portion that is relevant to JDK remains the same as
before.
I will work with the JDK porting team to see if we can come up with
a less fragile way for them to do what they need to do. I understand
the current approach was necessary in order to work around some
limitations of the dynamic linker. Maybe it's not necessary any
more.
PT_INTERP program header entry, to ensure that gdb always finds
the right dynamic linker.
Use obj->relocbase to simplify a few calculations where appropriate.
loaded separately by dlopen that have global symbols with identical
names. Viewing each dlopened object as a DAG which is linked by its
DT_NEEDED entries in the dynamic table, the search order is as
follows:
* If the referencing object was linked with -Bsymbolic, search it
internally.
* Search all dlopened DAGs containing the referencing object.
* Search all objects loaded at program start up.
* Search all objects which were dlopened() using the RTLD_GLOBAL
flag (which is now supported too).
The search terminates as soon as a strong definition is found.
Lacking that, the first weak definition is used.
These rules match those of Solaris, as best I could determine them
from its vague manual pages and the results of experiments I performed.
PR: misc/12438
violations in certain obscure cases involving failed dlopens. Many
thanks to Archie Cobbs for providing me with a good test case.
Eliminate a block that existed only to localize a declaration.
the dynamic linker didn't clean up properly. A subsequent dlopen()
of the same object would appear to succeed.
Another excellent fix from Max Khon.
PR: bin/12471
Submitted by: Max Khon <fjoe@iclub.nsu.ru>
discovered by Hidetoshi Shimokawa. Large programs need multiple
GOTs. The lazy binding stub in the PLT can be reached from any of
these GOTs, but the dynamic linker only has enough information to
fix up the first GOT entry. Thus calls through the other GOTs went
through the time-consuming lazy binding process on every call.
This fix rewrites the PLT entries themselves to bypass the lazy
binding.
Tested by Hidetoshi Shimokawa and Steve Price.
Reviewed by: Doug Rabson <dfr@freebsd.org>
function. It was an ill-considered feature. It didn't solve the
problem I wanted it to solve. And it added Yet Another Version
Number that would have to be maintained at every release point.
I'm nuking it now before anybody grows too fond of it.
_init() functions, initialize the global variables "__progname" and
"environ". This makes it possible for the _init() functions to call
things like getenv() and err().
the Makefile, and move it down into the architecture-specific
subdirectories.
Eliminate an asm() statement for the i386.
Make the dynamic linker work if it is built as an executable instead
of as a shared library. See i386/Makefile.inc to find out how to
do it. Note, this change is not enabled and it might never be
enabled. But it might be useful in the future. Building the
dynamic linker as an executable should make it start up faster,
because it won't have any relocations. But in practice I suspect
the difference is negligible.
avoid crashing inside rtld (since it's easy) since everything else handles
it. Of course, if the target program checks argv[], it'll fall over.
Reviewed by: jdp
References from GDB to "printf" and various other functions would
find the versions in the dynamic linker itself, rather than the
versions in the program's libc. This fix moves the GDB link map
entry for the dynamic linker to the end of the search list, where
its symbols will be found only if they are not found anywhere else.
It was suggested by Doug Rabson, though I implemented it a little
differently.
I personally would prefer to leave the dynamic linker's entry out
of the GDB search list altogether. But Doug argues that it is
handy there for such things as setting breakpoints on dlopen().
So it stays for now, at least.
Note, if we ever integrate the dynamic linker with libc (which has
several important benefits to recommend it), this whole problem
goes away.
dynamic linker itself dynamically allocated. All of them are
supposed to be dynamically allocated, but we cheated before. It
made gdb unhappy under some circumstances.
a different file than the a.out hints, namely, "/var/run/ld-elf.so.hints".
These hints consist only of the directory search path. There is
no hash table as in the a.out hints, because ELF doesn't have to
search for the file with the highest minor version number. (It
doesn't have minor version numbers at all.)
A single run of ldconfig updates either the a.out hints or the ELF
hints, but not both. The set of hints to process is selected in
the usual way, via /etc/objformat, or ${OBJFORMAT}, or the "-aout"
or "-elf" command line option. The rationale is that you probably
want to search different directories for ELF than for a.out.
"ldconfig -r" is faked up to produce output like we are used to,
except that for ELF there are no minor version numbers. This should
enable "ldconfig -r" to be used for checking LIB_DEPENDS in ports
even for ELF.
I implemented the ELF functionality in a new source file, with an
eye toward eliminating the a.out code entirely at some point in
the future.
shared object. Note, this searches _only_ that object, and not its
needed objects, in accordance with the documentation.
Also fix dlopen(NULL, ...) so that the executable's needed objects
are searched as well as the executable itself.
or Elf64 based on the inclusion of the machine dependent header.
I've left the addition of the extra fields to handle the relocation
structures with addend for a separate commit after jdp has had a chance
to review what I've done. The current change is needed to compile
csu/alpha/crt1.c
quite a few enhancements and bug fixes. There are still some known
deficiencies, but it should be adequate to get us started with ELF.
Submitted by: John Polstra <jdp@polstra.com>
If it is set to a nonempty string, then simply skip any missing
shared libraries. This came up in a discussion long ago as a
potentially useful feature at sysinstall time. For example, an
X11 utility could be used without the X libraries being present,
provided the utility had a mode in which no X functions were actually
called.
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
nonempty string, then function calls are relocated at program start-up
rather than lazily. This variable is standard on Sun and SVR4 systems.
The dlopen() function now supports both lazy and immediate binding, as
determined by its "mode" argument, which can be either 1 (RTLD_LAZY) or
2 (RTLD_NOW). I will add defines of these symbols to <dlfcn.h> as soon
as I've done a little more checking to make sure they won't cause
collisions or bootstrapping problems that would break "make world".
The "LD_*" environment variables which alter dynamic linker behavior are
now treated as unset if they are set to the empty string. This agrees
with the standard SVR4 conventions for the dynamic linker.
Add a work-around for programs compiled with certain buggy versions of
crt0.o. The buggy versions failed to set the "crt_ldso" member of the
interface structure. This caused certain error messages from the
dynamic linker to begin with "(null)" instead of the pathname of the
dynamic linker.
configurable fallback search paths, as well as new crt interface version.
Also:
- even faster getenv(), get all environment variable settings in a single
pass.
- ldd printf-like format specifications
- minor code cleanups, one vsprintf -> vsnprintf (harmless)
The library search sequence is a little more complete now. Before,
it'd search $LD_LIBRARY_PATH (by opendir/readdir/closedir), then read
the hints file, then read /usr/lib (again by scanning thr directory). It
would then fail if there was no "found" library.
Now, it does LD_LIBRARY_PATH and the hints file the same, but then uses
a longer fallback path. The -R path is fetched from the executable if
specified at build time, the ldconfig path is appended, and /usr/lib is
appended to that. Duplicates are suppressed. This means that simply
placing a new library in /usr/local/lib will work (the same as it did in
/usr/lib) without needing ldconfig -m. It will find it quicker if the
ldconfig is run though.
Similar changes have been made to the NetBSD ld.so, but ours is rather
different now due to John Polstra's speedups and fixes from a while back.
The ldd printf-like format support came direct from NetBSD.
Reviewed by: nate, jdp
descriptions of LD_NO_INTERN_SEARCH and LD_NOSTD_PATH from the manual
page, since they are not supported.
Submitted by: Doug Ambrisko <ambrisko@ambrisko.roble.com>