Add an lld option to emit PC-relative relocations for ifunc calls.

The current kernel ifunc implementation creates a PLT entry for each
ifunc definition.  ifunc calls therefore consist of a call to the
PLT entry followed by an indirect jump.  The jump target is written
during boot when the kernel linker resolves R_[*]_IRELATIVE relocations.
This implementation is defined by requirements for userland code, where
text relocations are avoided.  This requirement is not present for the
kernel, so the implementation has avoidable overhead (namely, an extra
indirect jump per call).
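The indirection described above can be modeled in ordinary C++ as a call through a function-pointer slot that is filled once at startup. This is only an illustrative sketch: the names (doubleGeneric, doubleFast, resolveDouble, ifuncSlot, applyIrelativeModel, callThroughSlot) are hypothetical and appear in neither the kernel nor lld, and a real PLT entry is machine code rather than a C++ function.

```cpp
#include <cstdint>

// Two interchangeable implementations of the same hypothetical routine.
static std::uint32_t doubleGeneric(std::uint32_t x) { return x + x; }
static std::uint32_t doubleFast(std::uint32_t x) { return x << 1; }

using DoubleFn = std::uint32_t (*)(std::uint32_t);

// Models an ifunc resolver: selects an implementation from a CPU feature
// flag (the flag is an assumption made for illustration).
static DoubleFn resolveDouble(bool cpuHasFastPath) {
  return cpuHasFastPath ? doubleFast : doubleGeneric;
}

// Models the slot that the PLT entry jumps through.
static DoubleFn ifuncSlot = nullptr;

// Models boot-time processing of one IRELATIVE relocation: run the
// resolver once and store its result in the slot.
static void applyIrelativeModel(bool cpuHasFastPath) {
  ifuncSlot = resolveDouble(cpuHasFastPath);
}

// Models a call site: a direct call to this function (the "PLT entry"),
// which then performs the extra indirect jump through the slot.
static std::uint32_t callThroughSlot(std::uint32_t x) {
  return ifuncSlot(x);
}
```

Every call pays for the load and indirect jump in callThroughSlot, which is the overhead the commit message refers to.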

Address this for now by adding a special option to the static linker
to inhibit PLT creation for ifuncs.  Instead, relocations to ifunc call
sites are passed through to the output file, so the kernel linker can
enumerate such call sites and apply PC-relative relocations directly
to the text section.  Thus the overhead of an ifunc call becomes exactly
the same as that of an ordinary function call.  This option is only for
use by the kernel and will not work for regular programs.
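As a rough illustration of what applying a PC-relative relocation to a call site involves, the sketch below patches a 32-bit displacement using the usual S + A - P computation (resolved target, plus addend, minus the address of the patched field). It is not the kernel linker's code: PreservedReloc and applyPcRel32 are hypothetical names, and the little-endian 32-bit field and the addend of -4 in the usage note assume an x86-64 style call instruction.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical record for one relocation preserved in the output file.
struct PreservedReloc {
  std::uint64_t offset;  // offset of the 32-bit field within the text image
  std::int64_t addend;   // relocation addend (A)
};

// Patch one call site so that it targets `target` directly.
// `textBase` is the address at which `text` is loaded.
// Returns false if the field lies outside the image or the
// displacement does not fit in 32 bits.
bool applyPcRel32(std::vector<std::uint8_t> &text, std::uint64_t textBase,
                  const PreservedReloc &rel, std::uint64_t target) {
  if (rel.offset > text.size() || text.size() - rel.offset < 4)
    return false;

  // value = S + A - P
  const std::uint64_t place = textBase + rel.offset;
  const std::int64_t value = static_cast<std::int64_t>(target) + rel.addend -
                             static_cast<std::int64_t>(place);
  if (value < std::numeric_limits<std::int32_t>::min() ||
      value > std::numeric_limits<std::int32_t>::max())
    return false;

  // Store the displacement in little-endian byte order.
  const std::uint32_t disp =
      static_cast<std::uint32_t>(static_cast<std::int32_t>(value));
  for (std::size_t i = 0; i < 4; ++i)
    text[rel.offset + i] = static_cast<std::uint8_t>((disp >> (8 * i)) & 0xff);
  return true;
}
```

For example, with a call opcode at address 0x1000, its displacement field at offset 1, an addend of -4 and a resolved target of 0x2000, the stored displacement is 0xFFB, which is the distance from the end of the five-byte instruction (0x1005) to the target.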

The final form of this optimization is up for debate; for now, this
change is simple and static enough to be acceptable as an interim
solution.

Reviewed by:	emaste
Discussed with:	arichardson, dim
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16748
Mark Johnston 2018-08-23 14:58:19 +00:00
parent 35437b1f16
commit 4023442dc9
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/head/; revision=338251
5 changed files with 31 additions and 4 deletions

@@ -155,6 +155,7 @@ struct Configuration {
 bool ZCombreloc;
 bool ZExecstack;
 bool ZHazardplt;
+bool ZIfuncnoplt;
 bool ZNocopyreloc;
 bool ZNodelete;
 bool ZNodlopen;

@@ -669,6 +669,7 @@ void LinkerDriver::readConfigs(opt::InputArgList &Args) {
 Config->ZCombreloc = !hasZOption(Args, "nocombreloc");
 Config->ZExecstack = hasZOption(Args, "execstack");
 Config->ZHazardplt = hasZOption(Args, "hazardplt");
+Config->ZIfuncnoplt = hasZOption(Args, "ifunc-noplt");
 Config->ZNocopyreloc = hasZOption(Args, "nocopyreloc");
 Config->ZNodelete = hasZOption(Args, "nodelete");
 Config->ZNodlopen = hasZOption(Args, "nodlopen");

@@ -374,6 +374,9 @@ static bool isStaticLinkTimeConstant(RelExpr E, RelType Type, const Symbol &Sym,
 R_PPC_PLT_OPD, R_TLSDESC_CALL, R_TLSDESC_PAGE, R_HINT>(E))
 return true;
+if (Sym.isGnuIFunc() && Config->ZIfuncnoplt)
+return false;
 // These never do, except if the entire file is position dependent or if
 // only the low bits are used.
 if (E == R_GOT || E == R_PLT || E == R_TLSDESC)
@@ -921,7 +924,9 @@ static void scanRelocs(InputSectionBase &Sec, ArrayRef<RelTy> Rels) {
 // Strenghten or relax a PLT access.
 //
 // GNU ifunc symbols must be accessed via PLT because their addresses
-// are determined by runtime.
+// are determined by runtime. If the -z ifunc-noplt option is specified,
+// we permit the optimization of ifunc calls by omitting the PLT entry
+// and preserving relocations at ifunc call sites.
 //
 // On the other hand, if we know that a PLT entry will be resolved within
 // the same ELF module, we can skip PLT access and directly jump to the
@@ -929,7 +934,7 @@ static void scanRelocs(InputSectionBase &Sec, ArrayRef<RelTy> Rels) {
 // all dynamic symbols that can be resolved within the executable will
 // actually be resolved that way at runtime, because the main exectuable
 // is always at the beginning of a search list. We can leverage that fact.
-if (Sym.isGnuIFunc())
+if (Sym.isGnuIFunc() && !Config->ZIfuncnoplt)
 Expr = toPlt(Expr);
 else if (!Preemptible && Expr == R_GOT_PC && !isAbsoluteValue(Sym))
 Expr =
@@ -1034,6 +1039,16 @@ static void scanRelocs(InputSectionBase &Sec, ArrayRef<RelTy> Rels) {
 continue;
 }
+// Preserve relocations against ifuncs if we were asked to do so.
+if (Sym.isGnuIFunc() && Config->ZIfuncnoplt) {
+if (Config->IsRela)
+InX::RelaDyn->addReloc({Type, &Sec, Offset, false, &Sym, Addend});
+else
+// Preserve the existing addend.
+InX::RelaDyn->addReloc({Type, &Sec, Offset, false, &Sym, 0});
+continue;
+}
 // If the output being produced is position independent, the final value
 // is still not known. In that case we still need some help from the
 // dynamic linker. We can however do better than just copying the incoming

@@ -1400,8 +1400,11 @@ template <class ELFT> void Writer<ELFT>::finalizeSections() {
 applySynthetic({InX::EhFrame},
 [](SyntheticSection *SS) { SS->finalizeContents(); });
-for (Symbol *S : Symtab->getSymbols())
+for (Symbol *S : Symtab->getSymbols()) {
 S->IsPreemptible |= computeIsPreemptible(*S);
+if (S->isGnuIFunc() && Config->ZIfuncnoplt)
+S->ExportDynamic = true;
+}
 // Scan relocations. This must be done after every symbol is declared so that
 // we can correctly decide if a dynamic relocation is needed.

@@ -25,7 +25,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd February 7, 2018
+.Dd August 22, 2018
 .Dt LD.LLD 1
 .Os
 .Sh NAME
@@ -443,6 +443,13 @@ Make the main stack executable.
 Stack permissions are recorded in the
 .Dv PT_GNU_STACK
 segment.
+.It Cm ifunc-noplt
+Do not emit PLT entries for GNU ifuncs.
+Instead, preserve relocations for ifunc call sites so that they may
+be applied by a run-time loader.
+Note that this feature requires special loader support and will
+generally result in application crashes when used outside of freestanding
+environments.
 .It Cm muldefs
 Do not error if a symbol is defined multiple times.
 The first definition will be used.