freebsd-dev

Author	SHA1	Message	Date
Tim J. Robbins	ea9a9a377b	Add UTF-8-specific implementations of mbsnrtowcs() and wcsnrtombs(). These convert plain ASCII characters in-line, making them only slightly slower than the single-byte ("NONE" encoding) version when processing ASCII strings.	2004-07-27 06:29:48 +00:00
Tim J. Robbins	550473de5b	Add fast paths for conversion of plain ASCII characters.	2004-07-09 15:46:06 +00:00
Tim J. Robbins	5e44d7ebe1	Use conversion state objects to store the accumulated wide character, low bound, and the number of bytes remaining instead of storing the raw byte sequence and deriving them every time mbrtowc() is called. This is much faster -- about twice as fast in some crude benchmarks.	2004-05-17 12:32:40 +00:00
Tim J. Robbins	2051a8f2d5	Move prototypes of various encoding-related functions into a new header file to avoid extern'ing them all over the place.	2004-05-12 14:09:04 +00:00
Tim J. Robbins	fc813796d2	Perform some basic validation of multibyte conversion state objects.	2004-04-12 13:09:18 +00:00
Tim J. Robbins	fa02ee78c8	Don't cast away const qualifiers. Spotted by: bde	2004-04-10 00:27:52 +00:00
Tim J. Robbins	ca2dae426e	Allow partial multibyte characters to accumulate in conversion state objects passed to mbrtowc(), mbsrtowcs(), and mbrlen(), as required by C99.	2004-04-07 10:48:19 +00:00
Tim J. Robbins	b1c572ad5b	Fix a typo that caused mbrtowc() to always return 0.	2003-11-11 07:25:05 +00:00
Tim J. Robbins	02f4f60ad5	Convert the Big5, EUC, MSKanji and UTF-8 encoding methods to implement mbrtowc() and wcrtomb() directly. GB18030, GBK and UTF2 are left unconverted; GB18030 will be done eventually, but GBK and UTF2 may just be removed, as they are subsets of GB18030 and UTF-8 respectively.	2003-11-02 10:09:33 +00:00
Jacques Vidrine	6d7bd75a4e	Whack 28 unused variables.	2003-02-18 13:39:52 +00:00
Tim J. Robbins	972baa3747	Add a UTF-8 encoding method, which will eventually replace the antique "UTF2" method. Although UTF-8 and the old UTF2 encoding are compatible for 16-bit characters, the new UTF-8 implementation is much more strict about rejecting malformed input and also handles the full 31 bit range of characters.	2002-10-10 22:56:18 +00:00

11 Commits