causing mb* functions (and similar) to be called with the wrong data
(possibly a null pointer, causing a crash).
PR: standards/188036
MFC after: 1 week
- Address performance regressions encountered by das@ by caching per-thread
data in TLS where available.
- Add a __NO_TLS flag to cdefs.h to indicate where not available.
- Reorganise the xlocale.h definitions into xlocale/*.h so that they can be
included from multiple places.
- Export the POSIX2008 subset of xlocale when POSIX2008 says it should be
exported, independently of whether xlocale.h is included.
- Fix the bug where programs using ctype functions always assumed ASCII unless
recompiled.
- Fix some style(9) violations.
Reviewed by: brooks (mentor)
Approved by: dim (mentor)
load of _l suffixed versions of various standard library functions that use
the global locale, making them take an explicit locale parameter. Also
adds support for per-thread locales. This work was funded by the FreeBSD
Foundation.
Please test any code you have that uses the C standard locale functions!
Reviewed by: das (gdtoa changes)
Approved by: dim (mentor)
for wide characters locales in the argument range >= 0x80 - they may
return false positives.
Example 1: for UTF-8 locale we currently have:
iswspace(0xA0)==1 and isspace(0xA0)==1
(because iswspace() and isspace() are the same code)
but must have
iswspace(0xA0)==1 and isspace(0xA0)==0
(because there is no such character and all others in the range
0x80..0xff for the UTF-8 locale, it keeps ASCII only in the single byte
range because our internal wchar_t representation for UTF-8 is UCS-4).
Example 2: for all wide character locales isalpha(arg) when arg > 0xFF may
return false positives (must be 0).
(because iswalpha() and isalpha() are the same code)
This change address this issue separating single byte and wide ctype
and also fix iswascii() (currently iswascii() is broken for
arguments > 0xFF).
This change is 100% binary compatible with old binaries.
Reviewied by: i18n@
. Replace inclusion of sys/param.h to sys/cdefs.h and sys/types.h where
appropriate.
. move _*_init() prototypes to mblocal.h, and remove these prototypes
from .c files
. use _none_init() in __setrunelocale() instead of duplicating code
. move __mb* variables from table.c to none.c allowing us to not to
export _none_*() externs, and appropriately remove them from mblocal.h
Ok'ed by: tjr
convenient when the source string isn't null-terminated.
Implement the other conversion functions (mbstowcs(), mbsrtowcs(), wcstombs(),
wcsrtombs()) in terms of these new functions.
with ``__'' to avoid polluting the namespace. This doesn't change the
documented rune interface at all, but breaks applications that accessed
_RuneLocale directly.
as wrappers around the deprecated 4.4BSD rune functions. This paves the
way for state-dependent encodings, which the rune API does not support.
- Add __emulated_sgetrune() and __emulated_sputrune(), which are
implementations of sgetrune() and sputrune() in terms of
mbrtowc() and wcrtomb().
- Rename the old rune-wrapper mbrtowc() and wcrtomb() functions to
__emulated_mbrtowc() and __emulated_wcrtomb().
- Add __mbrtowc and __wcrtomb function pointers, which point to the
current locale's conversion functions, or the __emulated versions.
- Implement mbrtowc() and wcrtomb() as calls to these function pointers.
- Make the "NONE" encoding implement mbrtowc() and wcrtomb() directly.
All of this emulation mess will be removed, together with rune support,
in FreeBSD 6.
"UTF2" method. Although UTF-8 and the old UTF2 encoding are compatible
for 16-bit characters, the new UTF-8 implementation is much more strict
about rejecting malformed input and also handles the full 31 bit range
of characters.
currently cached data. It allows a number of nice things, like: removing
fallback code from single locale loading, remove memory leak when LC_CTYPE
data loaded again and again, efficient cache use, not only for
setlocale(locale1); setlocale(locale1), but for setlocale(locale1);
setlocale("C"); setlocale(locale1) too (i.e. data file loaded only once).
the diff is attached below. This is done on the 3.0 source-tree.
I have test this on 2.2-stable before, but I don't have a 3.0 machine
right now.
This patch is mainly to make libc support BIG5 encoding, thus add
zh_TW.BIG5 locale to 3.0.
Submitted by: Chen Hsiung Chan <frankch@waru.life.nthu.edu.tw>