18 Commits

Author SHA1 Message Date
Andrey A. Chernov
367ed4e13d The problem is: currently our single byte ctype(3) functions are broken
for wide characters locales in the argument range >= 0x80 - they may
return false positives.

Example 1: for UTF-8 locale we currently have:
iswspace(0xA0)==1 and isspace(0xA0)==1
(because iswspace() and isspace() are the same code)
but must have
iswspace(0xA0)==1 and isspace(0xA0)==0
(because there is no such character and all others in the range
0x80..0xff for the UTF-8 locale, it keeps ASCII only in the single byte
range because our internal wchar_t representation for UTF-8 is UCS-4).

Example 2: for all wide character locales isalpha(arg) when arg > 0xFF may
return false positives (must be 0).
(because iswalpha() and isalpha() are the same code)

This change address this issue separating single byte and wide ctype
and also fix iswascii() (currently iswascii() is broken for
arguments > 0xFF).
This change is 100% binary compatible with old binaries.

Reviewied by: i18n@
2007-10-13 16:28:22 +00:00
Alexey Zelkin
e94c6cb4a2 . Static'ize functions exported via function reference variables only.
. Replace inclusion of sys/param.h to sys/cdefs.h and sys/types.h where
  appropriate.
. move _*_init() prototypes to mblocal.h, and remove these prototypes
  from .c files
. use _none_init() in __setrunelocale() instead of duplicating code
. move __mb* variables from table.c to none.c allowing us to not to
  export _none_*() externs, and appropriately remove them from mblocal.h

Ok'ed by:	tjr
2005-02-27 15:11:09 +00:00
Tim J. Robbins
b666b593eb Use a simpler, faster buffering scheme for partial characters in mbrtowc(). 2004-05-14 15:40:47 +00:00
Tim J. Robbins
2051a8f2d5 Move prototypes of various encoding-related functions into a new header
file to avoid extern'ing them all over the place.
2004-05-12 14:09:04 +00:00
Tim J. Robbins
88af941a73 In the absence of proper validation, at least check that null bytes
do not appear as anything but the first byte of a multibyte character.
2004-05-11 14:08:22 +00:00
Tim J. Robbins
fc813796d2 Perform some basic validation of multibyte conversion state objects. 2004-04-12 13:09:18 +00:00
Tim J. Robbins
fa02ee78c8 Don't cast away const qualifiers.
Spotted by:	bde
2004-04-10 00:27:52 +00:00
Tim J. Robbins
ca2dae426e Allow partial multibyte characters to accumulate in conversion state
objects passed to mbrtowc(), mbsrtowcs(), and mbrlen(), as required
by C99.
2004-04-07 10:48:19 +00:00
Tim J. Robbins
9e0bd333f0 Remove unused #includes. 2003-11-08 02:58:37 +00:00
Tim J. Robbins
02f4f60ad5 Convert the Big5, EUC, MSKanji and UTF-8 encoding methods to implement
mbrtowc() and wcrtomb() directly. GB18030, GBK and UTF2 are left
unconverted; GB18030 will be done eventually, but GBK and UTF2 may just
be removed, as they are subsets of GB18030 and UTF-8 respectively.
2003-11-02 10:09:33 +00:00
Tim J. Robbins
0b78986fe2 FA, FB and FC are lead bytes according to recent Microsoft documentation. 2002-10-14 01:50:45 +00:00
Tim J. Robbins
d891f26821 Style changes. Mainly removing excessive whitespace and parens. 2002-10-14 01:46:18 +00:00
David E. O'Brien
333fc21e3c Fix the style of the SCM ID's.
I believe have made all of libc .c's as consistent as possible.
2002-03-22 21:53:29 +00:00
David E. O'Brien
c05ac53b8b Remove __P() usage. 2002-03-21 22:49:10 +00:00
Daniel Eischen
d201fe46e3 Remove _THREAD_SAFE and make libc thread-safe by default by
adding (weak definitions to) stubs for some of the pthread
functions.  If the threads library is linked in, the real
pthread functions will pulled in.

Use the following convention for system calls wrapped by the
threads library:
	__sys_foo - actual system call
	_foo - weak definition to __sys_foo
	foo - weak definition to __sys_foo

Change all libc uses of system calls wrapped by the threads
library from foo to _foo.  In order to define the prototypes
for _foo(), we introduce namespace.h and un-namespace.h
(suggested by bde).  All files that need to reference these
system calls, should include namespace.h before any standard
includes, then include un-namespace.h after the standard
includes and before any local includes.  <db.h> is an exception
and shouldn't be included in between namespace.h and
un-namespace.h  namespace.h will define foo to _foo, and
un-namespace.h will undefine foo.

Try to eliminate some of the recursive calls to MT-safe
functions in libc/stdio in preparation for adding a mutex
to FILE.  We have recursive mutexes, but would like to avoid
using them if possible.

Remove uneeded includes of <errno.h> from a few files.

Add $FreeBSD$ to a few files in order to pass commitprep.

Approved by:	-arch
2001-01-24 13:01:12 +00:00
Andrey A. Chernov
8b96e6c916 Megre XPG4 code into libc 2000-06-03 12:24:08 +00:00
Andrey A. Chernov
3f2fd98c12 Move it under XPG4 define 1997-09-25 23:20:26 +00:00
Julian Elischer
16f76e6f06 Submitted by: Sin'ichiro MIYATANI / Phase One, Inc <siu@phaseone.co.jp>
Basic support for the Shift JIS encoding of japanese.
(and one tiny typo fixed in a comment)
1997-09-24 20:38:12 +00:00