idea is that we perform multibyte->wide character conversion while parsing
and compiling, then convert byte sequences to wide characters when they're
needed for comparison and stepping through the string during execution.
As with tr(1), the main complication is to efficiently represent sets of
characters in bracket expressions. The old bitmap representation is replaced
by a bitmap for the first 256 characters combined with a vector of individual
wide characters, a vector of character ranges (for [A-Z] etc.), and a vector
of character classes (for [[:alpha:]] etc.).
One other point of interest is that although the Boyer-Moore algorithm had
to be disabled in the general multibyte case, it is still enabled for UTF-8
because of its self-synchronizing nature. This greatly speeds up matching
by reducing the number of multibyte conversions that need to be done.
consequently the exponent is only 11 bits. Testing whether the
exponent equals 32767 in that case only effects to compiler warnings
and thus build breakage.
isnormal() the hard way, rather than relying on fpclassify(). This is
a lose in the sense that we need a total of 12 functions, but it is
necessary for binary compatibility because we have never bumped libm's
major version number. In particular, isinf(), isnan(), and isnanf()
were BSD libc functions before they were C99 macros, so we can't
reimplement them in terms of fpclassify() without adding a dependency
on libc.so.5. I have tried to arrange things so that programs that
could be compiled in FreeBSD 4.X will generate the same external
references when compiled in 5.X. At the same time, the new macros
should remain C99-compliant.
The isinf() and isnan() functions remain in libc for historical
reasons; however, I have moved the functions that implement the macros
isfinite() and isnormal() to libm where they belong. Moreover,
half a dozen MD versions of isinf() and isnan() have been replaced
with MI versions that work equally well.
Prodded by: kris
class. This is necessary in order to implement tr(1) efficiently in
multibyte locales, since the brute force method of finding all characters
in a class is infeasible with a 32-bit (or wider) wchar_t.
under the RETURN VALUES section so it is consistent with others.
Cleanup the return value text for getenv(3) a little while I am here.
PR: docs/58033
MFC after: 3 days
with ``__'' to avoid polluting the namespace. This doesn't change the
documented rune interface at all, but breaks applications that accessed
_RuneLocale directly.
This is a corresponding change to bin/67994. I'll soon commit
bin/67994 into 4-STABLE. Actually, 5-CURRENT's getaddrinfo()
doesn't have the problem mentiond in bin/67994. However, it is
good to be in sync variable name with 4-STABLE and KAME.
PR: bin/67994
Submitted by: JINMEI Tatuya <jinmei@ocean.jinmei.org>