Commit Graph

23 Commits

Author SHA1 Message Date
theraven
0f6ef690b3 Implement xlocale APIs from Darwin, mainly for use by libc++. This adds a
load of _l suffixed versions of various standard library functions that use
the global locale, making them take an explicit locale parameter.  Also
adds support for per-thread locales.  This work was funded by the FreeBSD
Foundation.

Please test any code you have that uses the C standard locale functions!

Reviewed by:    das (gdtoa changes)
Approved by:    dim (mentor)
2011-11-20 14:45:42 +00:00
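A minimal sketch of what an `_l` suffixed call looks like from the caller's side, assuming a POSIX.1-2008 system that declares `newlocale()`, `freelocale()` and `toupper_l()` in `<locale.h>` and `<ctype.h>` (on Darwin and FreeBSD they are also reachable through `<xlocale.h>`). The helper name `upper_in_c_locale` is hypothetical and is not part of the commit.

```c
#define _POSIX_C_SOURCE 200809L

#include <ctype.h>
#include <locale.h>

/*
 * Hypothetical helper: upper-case c using an explicit "C" locale object,
 * leaving the global locale untouched.
 */
int
upper_in_c_locale(int c)
{
    locale_t loc = newlocale(LC_ALL_MASK, "C", (locale_t)0);
    int r;

    if (loc == (locale_t)0)
        return c;               /* locale creation failed: return input */
    r = toupper_l(c, loc);      /* explicit locale parameter */
    freelocale(loc);
    return r;
}
```

For the per-thread case, `uselocale(loc)` installs a locale object for the calling thread only, so the unsuffixed functions pick it up without a call to `setlocale()`.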
ache
a5038f060d The problem is: currently our single byte ctype(3) functions are broken
for wide character locales in the argument range >= 0x80 - they may
return false positives.

Example 1: for UTF-8 locale we currently have:
iswspace(0xA0)==1 and isspace(0xA0)==1
(because iswspace() and isspace() are the same code)
but must have
iswspace(0xA0)==1 and isspace(0xA0)==0
(because 0xA0, like every other value in the range 0x80..0xff, is not a
character in the UTF-8 locale; only ASCII lives in the single byte
range because our internal wchar_t representation for UTF-8 is UCS-4).

Example 2: for all wide character locales isalpha(arg) when arg > 0xFF may
return false positives (must be 0).
(because iswalpha() and isalpha() are the same code)

This change addresses this issue by separating single byte and wide ctype,
and also fixes iswascii() (currently iswascii() is broken for
arguments > 0xFF).
This change is 100% binary compatible with old binaries.

Reviewed by: i18n@
2007-10-13 16:28:22 +00:00
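The rule in Example 1 can be sketched as follows. This is an illustration of the intended behaviour for a UTF-8 locale only, not the code from the commit; the helper name `sb_isspace_utf8` is hypothetical. It assumes that in such a locale the single byte classifier should answer only for the ASCII range and report 0 for everything else.

```c
#include <ctype.h>

/*
 * Hypothetical model of single byte isspace() in a UTF-8 locale:
 * values outside 0x00..0x7F are not single byte characters there,
 * so they are never classified as whitespace.
 */
int
sb_isspace_utf8(int c)
{
    if (c < 0 || c > 0x7F)
        return 0;               /* not a single byte character */
    return isspace(c) != 0;     /* ASCII range: defer to ctype */
}
```

With this split, 0xA0 is rejected by the single byte path, while a wide character classifier such as `iswspace()` remains free to accept it.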
phantom
23d961a13f . Static'ize functions exported via function reference variables only.
. Replace inclusion of sys/param.h with sys/cdefs.h and sys/types.h where
  appropriate.
. move _*_init() prototypes to mblocal.h, and remove these prototypes
  from .c files
. use _none_init() in __setrunelocale() instead of duplicating code
. move __mb* variables from table.c to none.c, allowing us not to
  export _none_*() externs, and remove them from mblocal.h accordingly

Ok'ed by:	tjr
2005-02-27 15:11:09 +00:00
tjr
d04fd4700f Prefix the names of members of _RuneLocale and its sub-structures
with ``__'' to avoid polluting the namespace. This doesn't change the
documented rune interface at all, but breaks applications that accessed
_RuneLocale directly.
2004-06-23 07:01:44 +00:00
tjr
fb60260f98 Buffer partial wide characters more efficiently: instead of storing the
multibyte representation in conversion state objects, store the
accumulated wide character, set number and number of bytes remaining
to avoid having to derive them every time mbrtowc() is called.
2004-05-27 10:54:34 +00:00
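The idea of keeping the accumulated wide character and the remaining byte count in the conversion state can be sketched for UTF-8 as below. The struct layout and the names `utf8_state` and `utf8_feed` are hypothetical and do not reflect the actual libc state object; the sketch also skips checks for overlong forms and surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical conversion state: not the layout used by libc. */
struct utf8_state {
    uint32_t ch;    /* wide character accumulated so far */
    int      want;  /* continuation bytes still expected; 0 = initial */
};

/*
 * Feed one byte. Returns 1 and stores the character in *out when a
 * character is complete, 0 when more bytes are needed, and -1 on a
 * byte that cannot appear at this position.
 */
int
utf8_feed(struct utf8_state *st, unsigned char b, uint32_t *out)
{
    assert(st != NULL && out != NULL);

    if (st->want == 0) {
        if (b < 0x80) {                 /* ASCII: complete at once */
            *out = b;
            return 1;
        }
        if ((b & 0xE0) == 0xC0) {       /* lead byte of a 2-byte form */
            st->ch = b & 0x1F;
            st->want = 1;
        } else if ((b & 0xF0) == 0xE0) { /* 3-byte form */
            st->ch = b & 0x0F;
            st->want = 2;
        } else if ((b & 0xF8) == 0xF0) { /* 4-byte form */
            st->ch = b & 0x07;
            st->want = 3;
        } else {
            return -1;                  /* stray continuation or bad lead */
        }
        return 0;
    }
    if ((b & 0xC0) != 0x80)
        return -1;                      /* expected a continuation byte */
    st->ch = (st->ch << 6) | (b & 0x3F);
    if (--st->want > 0)
        return 0;                       /* still partial: state keeps ch */
    *out = st->ch;
    st->ch = 0;
    return 1;
}
```

Because the state already holds the partial value and the count, each call does constant work on the new byte instead of re-deriving both from buffered multibyte input.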
tjr
e3f042f4af Move prototypes of various encoding-related functions into a new header
file to avoid extern'ing them all over the place.
2004-05-12 14:09:04 +00:00
tjr
d79e71957e In the absence of proper validation, at least check that null bytes
do not appear as anything but the first byte of a multibyte character.
2004-05-11 14:08:22 +00:00
tjr
8f8a2ad179 Perform some basic validation of multibyte conversion state objects.
2004-04-12 13:09:18 +00:00
tjr
17077e5ae6 Don't cast away const qualifiers.
Spotted by:	bde
2004-04-10 00:27:52 +00:00
tjr
54a18fa1d6 Allow partial multibyte characters to accumulate in conversion state
objects passed to mbrtowc(), mbsrtowcs(), and mbrlen(), as required
by C99.
2004-04-07 10:48:19 +00:00
tjr
866579d246 Remove unused #includes.
2003-11-08 02:58:37 +00:00
tjr
1c3a3f7e26 Convert the Big5, EUC, MSKanji and UTF-8 encoding methods to implement
mbrtowc() and wcrtomb() directly. GB18030, GBK and UTF2 are left
unconverted; GB18030 will be done eventually, but GBK and UTF2 may just
be removed, as they are subsets of GB18030 and UTF-8 respectively.
2003-11-02 10:09:33 +00:00
ache
9d73d0dd12 Add safeguards to never use errno == 0 as setrunelocale() error return code
2002-08-09 08:22:29 +00:00
ache
3b0ddae36e Rewrite locale loading procedures, so any load failure will not affect
currently cached data.  It allows a number of nice things, like: removing
fallback code from single locale loading, remove memory leak when LC_CTYPE
data loaded again and again, efficient cache use, not only for
setlocale(locale1); setlocale(locale1), but for setlocale(locale1);
setlocale("C"); setlocale(locale1) too (i.e.  data file loaded only once).
2002-08-08 05:51:54 +00:00
ache
1994aec49d Fix wrong address when EucInfo > "variable" size
2002-08-07 20:20:56 +00:00
asmodai
3b69e0094c Remove the hard-coded limit of 3 bytes for EUC encodings.
Satoshi NIIMI-san kindly explained that EUC does not limit the byte length to
any arbitrary number.

We now set the limit to the maximum octet length of the codeset and it is
locale-specific.

Submitted by:	Yong-Jhen Hong <winard@ms11.url.com.tw>
2002-04-14 10:55:42 +00:00
asmodai
58d9ddfc7d Fix EUC encoding conversion for codeset 3 and 4 to comply with the specification.
PR:		28552
Submitted by:	NIIMI Satoshi <sa2c@and.or.jp>
2002-04-07 16:37:15 +00:00
obrien
d90536e35b Fix the style of the SCM ID's.
I believe I have made all of libc .c's as consistent as possible.
2002-03-22 21:53:29 +00:00
obrien
3b73ce2319 Remove __P() usage.
2002-03-21 22:49:10 +00:00
ache
50dddc0919 Merge XPG4 code into libc
2000-06-03 12:24:08 +00:00
jb
bde8299706 Include string.h for memcpy function prototype.
1998-01-14 08:14:56 +00:00
ache
6ee0412bd8 Migrate from XPG4 to XPG3 (libxpg4 will be added soon)
Remove big part of my startup_setlocale hack.
Add missing manpage links.
1995-10-23 01:34:17 +00:00
rgrimes
be22b15ae2 BSD 4.4 Lite Lib Sources
1994-05-27 05:00:24 +00:00