On anything modern, the C version, which processes a word at a time, is much
faster. The Intel optimization manual explicitly warns against using REP
prefixes with SCAS or CMPS, which is exactly what the assembler version
does.
A simple test on a Phenom II showed the C version, compiled with -O2, to be
about twice as fast determining the length of 100000 strings between 0 and
255 bytes long.
MFC after: 2 weeks
as they are slower than the generic version in C, at least on modern
hardware. This leaves us with just five implementations.
Suggested by: bde
Approved by: rpaulo (mentor)
Reorder inline assembly arguments temp2, temp, value and texp to follow
the st(0), st(1), etc. style.
Also mark the temp2 variable as volatile to workaround another clang
bug.
This allows clang to buildworld FreeBSD/i386.
Submitted by: dim
Looking at our source code history, it seems the uname(),
getdomainname() and setdomainname() system calls got deprecated
somewhere after FreeBSD 1.1, but they have never been phased out
properly. Because we don't have a COMPAT_FREEBSD1, just use
COMPAT_FREEBSD4.
Also fix the Linuxolator to build without the setdomainname() routine by
just making it call userland_sysctl on kern.domainname. Also replace the
setdomainname()'s implementation to use this approach, because we're
duplicating code with sysctl_domainname().
I wasn't able to keep these three routines working in our
COMPAT_FREEBSD32, because that would require yet another keyword for
syscalls.master (COMPAT4+NOPROTO). Because this routine is probably
unused already, this won't be a problem in practice. If it turns out to
be a problem, we'll just restore this functionality.
Reviewed by: rdivacky, kib
instead of 32+32+15+1) on all arches that have such long doubles (amd64,
ia64 and i386). Large objects should be be accessed in large units,
and the 32+32+15+1[+padding] decomposition asks for almost the opposite
of that, sometimes resulting in very slow accesses depending on how
well the compiler ignores what we ask for and converts to the best
units for the given machine. E.g., on Athlons, there is a 10-20 cycle
penalty for accessing the middle 32-bit word immediately after an
80-bit store.
Whether actually using the alternative view is better is very machine-
dependent. A 32+32+16 view is probably best with old 32-bit systems
and gcc through 4.2.1. The compiler should mostly avoid the view and
generate best accesses, but gcc-4.2.1 is far from doing that. I think
64+16 is best for now. Similarly for doubles -- they should be using
64+0 especially on 64-bit machines, but fdlibm uses 32+32 extensively
for them. Fortunately, in 64-bit mode for doubles, gcc already ignores
the 32+32-bit view and generates best accesses in many cases.
my original implementation made both use the same code. Unfortunately,
this meant libm depended on a vendor header at compile time and previously-
unexposed vendor bits in libc at runtime.
Hence, I just wrote my own version of the relevant vendor routine. As it
turns out, mine has a factor of 8 fewer of lines of code, and is a bit more
readable anyway. The strtod() and *scanf() routines still use vendor code.
Reviewed by: bde
syscalls, unless WITHOUT_SYSCALL_COMPAT is defined. The default case
will have the .c wrappers still. If you define WITHOUT_SYSCALL_COMPAT,
the .c wrappers will go away and libc will make direct syscalls.
After 7-stable starts, the direct syscall method will be default.
Approved by: re (kensmith)
particular:
SYSCALL() makes a syscall, with errno handling, and continues execution
directly after the macro in the non-error case.
RSYSCALL() is just like SYSCALL(), but returns after success.
Both SYSCALL(name) and RSYSCALL(name) export "__sys_name" as a strong
symbol, with "_name" and "name" as weak aliases.
PSEUDO() is just like RSYSCALL(), but skipping the "name" weak alias. It
still does "__sys_name" and "_name".
Change i386 to add errno handling to PSEUDO. The same for amd64 and
sparc64, with appear to have copied the behavior.
ia64 was correct (as was alpha). Just remove some apparently unused
variants of the macros. (untested!)
I believe powerpc is correct.
Fix arm to not export "name" from the PSEUDO case. Remove apparently
extra unused variants. (untested!)
The errno problem manifested on i386/amd64/sparc64 by having "PSEUDO"
classified syscalls return without setting errno. eg: "addr = mmap()"
could return with "addr" = 22 instead of setting errno to 22 and
returning -1.
Approved by: re (kensmith)
net: endhostdnsent is named _endhostdnsent and is
private to netdb family of functions.
posix1e: acl_size.c has been never compiled in,
so there's no "acl_size".
rpc: "getnetid" is a static function.
stdtime: "gtime" is #ifdef'ed out in the source.
some symbols are specific only to some architectures,
e.g., ___tls_get_addr is only defined on i386.
__htonl, __htons, __ntohl and __ntohs are no longer
functions, they are now (internal) defines in
<machine/endian.h>.
Submitted by: ru
we can find another way to issue an #error, but using a preprocessed
assembler for that purpose and clobbering libc.a with an empty .o
just for the sake of #error reporting is way too much of a burden.
Like on libthr, there is an i386_set_gsbase() stub implementation here
to avoid libc.so.5 issues. This should likely be a weak symbol and I
expect this will be fixed soon.
Approved by: re
bit in a long double. For architectures that don't have such a bit,
LDBL_NBIT is 0. This makes it possible to say `mantissa & ~LDBL_NBIT'
in places that previously used an #ifdef to select the right expression.
The optimizer should dispense with the extra arithmetic when LDBL_NBIT
is 0.
can't use the i386_set_ldt() family of routines, because they are not
implemented. Instead, use the recently exposed direct access sysarch
routines for setting what %fs and %gs point to.
Use this for the i386 TLS _set_tp() routine, but only when compiling to
run as a 32 bit support binary for amd64 kernels.