The US-ASCII format was getting treated identically to POSIX. It is
supposed to throw an ILSEQ errno if a value of 0x80 or greater is
encountered, so let's bring back the "ASCII" handling.
While here, change nl_codeset to return US-ASCII only when the encoding
really is "US-ASCII". Before "C" and "POSIX" encoding returned this
string, so now they return "POSIX".
Discussed with: ache
Submitted by: marino
Obtained from: DragonflyBSD
if not already defined. This allows building libc from outside of
lib/libc using a reach-over makefile.
A typical use-case is to build a standard ILP32 version and a COMPAT32
version in a single iteration by building the COMPAT32 version using a
reach-over makefile.
Obtained from: Juniper Networks, Inc.
I initially thought wchar_t was locale independent, but this seems to be
only the case on Linux. This means that we cannot depend on the *wc*()
routines to implement *c16*() and *c32*(). Instead, use the Citrus
libiconv that is part of libc.
I'll see if there is anything I can do to make the existing functions
somewhat useful in case the system is built without libiconv in the
nearby future. If not, I'll simply remove the broken implementations.
Reviewed by: jilles, gabor
The <uchar.h> header, part of C11, adds a small number of utility
functions for 16/32-bit "universal" characters, which may or may not be
UTF-16/32. As our wchar_t is already ISO 10646, simply add light-weight
wrappers around wcrtomb() and mbrtowc().
While there, also add (non-yet-standard) _l functions, similar to the
ones we already have for the other locale-dependent functions.
Reviewed by: theraven
load of _l suffixed versions of various standard library functions that use
the global locale, making them take an explicit locale parameter. Also
adds support for per-thread locales. This work was funded by the FreeBSD
Foundation.
Please test any code you have that uses the C standard locale functions!
Reviewed by: das (gdtoa changes)
Approved by: dim (mentor)
their implementations aren't in the same files. Introduce LIBC_ARCH
and use that in preference to MACHINE_CPUARCH. Tested by amd64 and
powerpc64 builds (thanks nathanw@)
GNU) for determining whether a string is an affirmative or negative
response to a question according to the current locale. This is done
by matching the response against nl_langinfo(3) items YESEXPR and NOEXPR.
convenient when the source string isn't null-terminated.
Implement the other conversion functions (mbstowcs(), mbsrtowcs(), wcstombs(),
wcsrtombs()) in terms of these new functions.
class. This is necessary in order to implement tr(1) efficiently in
multibyte locales, since the brute force method of finding all characters
in a class is infeasible with a 32-bit (or wider) wchar_t.
Instead of just deleting it, turn the original page into a general
overview of the multibyte character conversion functions, somewhat
similar to stdio(3).
as wrappers around the deprecated 4.4BSD rune functions. This paves the
way for state-dependent encodings, which the rune API does not support.
- Add __emulated_sgetrune() and __emulated_sputrune(), which are
implementations of sgetrune() and sputrune() in terms of
mbrtowc() and wcrtomb().
- Rename the old rune-wrapper mbrtowc() and wcrtomb() functions to
__emulated_mbrtowc() and __emulated_wcrtomb().
- Add __mbrtowc and __wcrtomb function pointers, which point to the
current locale's conversion functions, or the __emulated versions.
- Implement mbrtowc() and wcrtomb() as calls to these function pointers.
- Make the "NONE" encoding implement mbrtowc() and wcrtomb() directly.
All of this emulation mess will be removed, together with rune support,
in FreeBSD 6.
"UTF2" method. Although UTF-8 and the old UTF2 encoding are compatible
for 16-bit characters, the new UTF-8 implementation is much more strict
about rejecting malformed input and also handles the full 31 bit range
of characters.