These convert plain ASCII characters in-line, making them only slightly
slower than the single-byte ("NONE" encoding) version when processing
ASCII strings.
low bound, and the number of bytes remaining instead of storing the
raw byte sequence and deriving them every time mbrtowc() is called.
This is much faster -- about twice as fast in some crude benchmarks.
mbrtowc() and wcrtomb() directly. GB18030, GBK and UTF2 are left
unconverted; GB18030 will be done eventually, but GBK and UTF2 may just
be removed, as they are subsets of GB18030 and UTF-8 respectively.
"UTF2" method. Although UTF-8 and the old UTF2 encoding are compatible
for 16-bit characters, the new UTF-8 implementation is much more strict
about rejecting malformed input and also handles the full 31 bit range
of characters.