The variable _logname_valid is not exported via the version script;
therefore, change C and i386/amd64 assembler code to remove indirection
(which allowed interposition). This makes the code slightly smaller and
faster.
Also, remove #define PIC_GOT from i386/amd64 in !PIC mode. Without PIC,
there is no place containing the address of each variable, so there is no
possible definition for PIC_GOT.
check_deferred_signal() returns twice, since handle_signal() emulates
the return from the normal signal handler by sigreturn(2)ing the
passed context. Second return is performed on the destroyed stack
frame, because __fillcontextx() has already returned. This causes
undefined and bad behaviour, usually the victim thread gets SIGSEGV.
Avoid nested frame and the need to return from it by doing direct call
to getcontext() in the check_deferred_signal() and using a new private
libc helper __fillcontextx2() to complement the context with the
extended CPU state if the deferred signal is still present.
The __fillcontextx() is now unused, but is kept to allow older
libthr.so to be used with the new libc.
Mark __fillcontextx() as returning twice [1].
Reported by: pgj
Pointy hat to: kib
Discussed with: dim
Tested by: pgj, dim
Suggested by: jilles [1]
MFC after: 1 week
but use normal references instead of weak. This makes the statically
linked binaries to use fast gettimeofday(2) by forcing the linker to
resolve references and providing the neccessary functions.
Reported by: bde
Tested by: marius (sparc64)
MFC after: 2 weeks
For some reason, libc exports the symbol .cerror (HIDENAME(cerror)), albeit
in the FBSDprivate_1.0 version. It looks like there is no reason for this
since it is not used from other libraries. Given that it cannot be accessed
from C and its strange calling convention, it is rather unlikely that other
things rely on it. Perhaps it is from a time when symbols could not be
hidden.
Not exporting .cerror causes it to be jumped to directly instead of via the
PLT.
This change also takes advantage of .cerror's new status by not saving and
loading %ebx before jumping to it. (Therefore, .cerror now saves and loads
%ebx itself.) Where there was a conditional jump to a jump to .cerror, the
conditional jump has been changed to jump to .cerror directly (many modern
CPUs don't do static prediction and in any case it is not much of a benefit
anyway).
This change makes libc.so.7 a few kilobytes smaller.
Reviewed by: kib
clock_gettime(2) functions if supported. The speedup seen in
microbenchmarks is in range 4x-7x depending on the hardware.
Only amd64 and i386 architectures are supported. Libc uses rdtsc and
kernel data to calculate current time, if enabled by kernel.
Hopefully, this code is going to migrate into vdso in some future.
Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month
fit into existing mcontext_t.
On i386 and amd64 do return the extended FPU states using
getcontextx(3). For other architectures, getcontextx(3) returns the
same information as getcontext(2).
Tested by: pho
MFC after: 1 month
This allows people to still write statically linked applications that
call strchr() or strrchr() and have a local variable or function called
index.
Discussed with: bde@
As I looked through the C library, I noticed the FreeBSD MIPS port has a
hand-written version of index(). This is nice, if it weren't for the
fact that most applications call strchr() instead.
Also, on the other architectures index() and strchr() are identical,
meaning we have two identical pieces of code in the C library and
statically linked applications.
Solve this by naming the actual file strchr.[cS] and let it use
__strong_reference()/STRONG_ALIAS() to provide the index() routine. Do
the same for rindex()/strrchr().
This seems to make the C libraries and static binaries slightly smaller,
but this reduction in size seems negligible.
the word alignment, some versions of gcc do require 16-byte alignment.
Make sure the stack is 16-byte aligned before calling a subroutine.
Inspired by: PR amd64/162214
MFC after: 1 week
working MI one. The MI one only needs to be overridden on machines
with non-IEEE754 arithmetic. (The last supported one was the VAX.)
It can also be overridden if someone comes up with a faster one that
actually passes the regression tests -- but this is harder than it sounds.
On anything modern, the C version, which processes a word at a time, is much
faster. The Intel optimization manual explicitly warns against using REP
prefixes with SCAS or CMPS, which is exactly what the assembler version
does.
A simple test on a Phenom II showed the C version, compiled with -O2, to be
about twice as fast determining the length of 100000 strings between 0 and
255 bytes long.
MFC after: 2 weeks
as they are slower than the generic version in C, at least on modern
hardware. This leaves us with just five implementations.
Suggested by: bde
Approved by: rpaulo (mentor)
Reorder inline assembly arguments temp2, temp, value and texp to follow
the st(0), st(1), etc. style.
Also mark the temp2 variable as volatile to workaround another clang
bug.
This allows clang to buildworld FreeBSD/i386.
Submitted by: dim
Looking at our source code history, it seems the uname(),
getdomainname() and setdomainname() system calls got deprecated
somewhere after FreeBSD 1.1, but they have never been phased out
properly. Because we don't have a COMPAT_FREEBSD1, just use
COMPAT_FREEBSD4.
Also fix the Linuxolator to build without the setdomainname() routine by
just making it call userland_sysctl on kern.domainname. Also replace the
setdomainname()'s implementation to use this approach, because we're
duplicating code with sysctl_domainname().
I wasn't able to keep these three routines working in our
COMPAT_FREEBSD32, because that would require yet another keyword for
syscalls.master (COMPAT4+NOPROTO). Because this routine is probably
unused already, this won't be a problem in practice. If it turns out to
be a problem, we'll just restore this functionality.
Reviewed by: rdivacky, kib
instead of 32+32+15+1) on all arches that have such long doubles (amd64,
ia64 and i386). Large objects should be be accessed in large units,
and the 32+32+15+1[+padding] decomposition asks for almost the opposite
of that, sometimes resulting in very slow accesses depending on how
well the compiler ignores what we ask for and converts to the best
units for the given machine. E.g., on Athlons, there is a 10-20 cycle
penalty for accessing the middle 32-bit word immediately after an
80-bit store.
Whether actually using the alternative view is better is very machine-
dependent. A 32+32+16 view is probably best with old 32-bit systems
and gcc through 4.2.1. The compiler should mostly avoid the view and
generate best accesses, but gcc-4.2.1 is far from doing that. I think
64+16 is best for now. Similarly for doubles -- they should be using
64+0 especially on 64-bit machines, but fdlibm uses 32+32 extensively
for them. Fortunately, in 64-bit mode for doubles, gcc already ignores
the 32+32-bit view and generates best accesses in many cases.
my original implementation made both use the same code. Unfortunately,
this meant libm depended on a vendor header at compile time and previously-
unexposed vendor bits in libc at runtime.
Hence, I just wrote my own version of the relevant vendor routine. As it
turns out, mine has a factor of 8 fewer of lines of code, and is a bit more
readable anyway. The strtod() and *scanf() routines still use vendor code.
Reviewed by: bde
syscalls, unless WITHOUT_SYSCALL_COMPAT is defined. The default case
will have the .c wrappers still. If you define WITHOUT_SYSCALL_COMPAT,
the .c wrappers will go away and libc will make direct syscalls.
After 7-stable starts, the direct syscall method will be default.
Approved by: re (kensmith)
particular:
SYSCALL() makes a syscall, with errno handling, and continues execution
directly after the macro in the non-error case.
RSYSCALL() is just like SYSCALL(), but returns after success.
Both SYSCALL(name) and RSYSCALL(name) export "__sys_name" as a strong
symbol, with "_name" and "name" as weak aliases.
PSEUDO() is just like RSYSCALL(), but skipping the "name" weak alias. It
still does "__sys_name" and "_name".
Change i386 to add errno handling to PSEUDO. The same for amd64 and
sparc64, with appear to have copied the behavior.
ia64 was correct (as was alpha). Just remove some apparently unused
variants of the macros. (untested!)
I believe powerpc is correct.
Fix arm to not export "name" from the PSEUDO case. Remove apparently
extra unused variants. (untested!)
The errno problem manifested on i386/amd64/sparc64 by having "PSEUDO"
classified syscalls return without setting errno. eg: "addr = mmap()"
could return with "addr" = 22 instead of setting errno to 22 and
returning -1.
Approved by: re (kensmith)
net: endhostdnsent is named _endhostdnsent and is
private to netdb family of functions.
posix1e: acl_size.c has been never compiled in,
so there's no "acl_size".
rpc: "getnetid" is a static function.
stdtime: "gtime" is #ifdef'ed out in the source.
some symbols are specific only to some architectures,
e.g., ___tls_get_addr is only defined on i386.
__htonl, __htons, __ntohl and __ntohs are no longer
functions, they are now (internal) defines in
<machine/endian.h>.
Submitted by: ru
we can find another way to issue an #error, but using a preprocessed
assembler for that purpose and clobbering libc.a with an empty .o
just for the sake of #error reporting is way too much of a burden.