stefanf
5b6654bdf6
Prefer C99's __func__ over GCC's __FUNCTION__.
2004-09-22 16:56:49 +00:00
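For context on the entry above: `__func__` is the predefined identifier standardized in C99, and it behaves like a static character array holding the name of the enclosing function. A minimal sketch follows; the helper name `current_function_name` is hypothetical and does not come from the commit.

```c
#include <stdio.h>

/*
 * Hypothetical helper: return the name of this function using the
 * C99 predefined identifier __func__ rather than GCC's __FUNCTION__.
 */
static const char *
current_function_name(void)
{
	/* __func__ acts as if declared: static const char __func__[] = "..."; */
	return (__func__);
}
```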
tjr
9cf5fdf194
Re-word warning about the UTF2 encoding, taking care to use the word
"obsolete" instead of "deprecated".
2004-08-21 08:08:29 +00:00
tjr
7c805bebcd
Bump document date for previous.
2004-08-21 08:03:18 +00:00
tjr
4fe778a081
Re-word warning about the rune interface, taking care to use the word
"obsolete" instead of "deprecated".
2004-08-21 08:00:31 +00:00
tjr
0950eb6cba
Change "deprecated" in link-time warnings about various rune functions
to "obsolete".
2004-08-21 07:48:06 +00:00
tjr
a8cee78a82
Re-word compatibility section, taking care to use the word "obsolete" to
describe the 4.4BSD extension of accepting characters (runes) outside of
the range of unsigned char.
2004-08-21 07:37:08 +00:00
trhodes
05b6eacd90
/me kicks cvs update
Revert previous commit, tjr already fixed it and I was too stupid to
notice this fact.
Approved by: re (to avoid failing cvs ci)
2004-08-17 04:56:03 +00:00
trhodes
d788381502
Fix incorrect code in an example. The previous example produced output
19 column positions wide on the first line and 20 on the remaining lines.
This fixes the example to produce the correct output.
PR: 53454
Noticed by: Kuang-che Wu <kcwu@kcwu.homeip.net>
Submitted by: Marc Silver <marcs@draenor.org>
Approved by: re (scottl)
2004-08-17 04:45:52 +00:00
tjr
a1081fe738
Fix example.
2004-08-12 12:32:14 +00:00
tjr
84b5d3520f
Implement wcwidth() as an inline function.
2004-08-12 12:19:11 +00:00
tjr
24ab237a89
Re-word the COMPATIBILITY section, taking care to use the word "deprecated"
to describe the 4.4BSD extension of accepting arguments outside the range
of unsigned char. This gives us freedom to remove this extension when we
remove the <rune.h> interface in FreeBSD 6.
2004-07-29 23:32:41 +00:00
tjr
786e3d397c
Remove unnecessary #include directives.
2004-07-29 06:18:40 +00:00
tjr
45e69ebea9
Prefer <runetype.h> to <rune.h>, since the latter is going away soon.
2004-07-29 06:16:19 +00:00
tjr
922ba3746b
Remove useless checks for characters longer than INT_MAX bytes.
2004-07-29 06:08:31 +00:00
tjr
b9fa8ef024
Add UTF-8-specific implementations of mbsnrtowcs() and wcsnrtombs().
These convert plain ASCII characters in-line, making them only slightly
slower than the single-byte ("NONE" encoding) version when processing
ASCII strings.
2004-07-27 06:29:48 +00:00
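The in-line ASCII conversion described in the entry above can be illustrated roughly as follows. This is a sketch under stated assumptions, not the committed code: the function name `ascii_fast_path` and its signature are hypothetical. It copies the leading run of plain ASCII bytes directly and stops where a general multibyte converter would have to take over.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: convert the leading run of plain ASCII bytes
 * in src without calling a per-character converter.
 * Stops at the first NUL byte, at the first byte >= 0x80, or when
 * either limit is reached (nms source bytes, len destination slots).
 * Returns the number of characters stored in dst.
 */
static size_t
ascii_fast_path(wchar_t *dst, const char *src, size_t nms, size_t len)
{
	size_t i = 0;

	while (i < nms && i < len) {
		unsigned char c = (unsigned char)src[i];

		if (c == '\0' || c >= 0x80)
			break;	/* leave this byte to the general path */
		dst[i] = (wchar_t)c;
		i++;
	}
	return (i);
}
```

A caller would resume with the general converter at `src + n`, where `n` is the return value.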
tjr
7108a0ff8a
Return the correct value when dst == NULL and conversion has stopped after
nwc dropping to zero.
2004-07-22 02:57:29 +00:00
tjr
5b4f25c6e9
Implement the GNU extensions of mbsnrtowcs() and wcsnrtombs(). These are
convenient when the source string isn't null-terminated.
Implement the other conversion functions (mbstowcs(), mbsrtowcs(), wcstombs(),
wcsrtombs()) in terms of these new functions.
2004-07-21 10:54:57 +00:00
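As an illustration of why mbsnrtowcs() is convenient for unterminated input, the sketch below converts at most a given number of bytes from a buffer. The wrapper name `convert_prefix` and its error convention are assumptions made for the example; only mbsnrtowcs() itself is named in the entry above.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical wrapper: convert at most nbytes bytes of buf, which
 * need not be NUL-terminated. dst must have room for dstlen wide
 * characters, including the terminator appended here.
 * Returns the number of wide characters stored (excluding the
 * terminator), or (size_t)-1 on an invalid sequence or if dstlen is 0.
 */
static size_t
convert_prefix(wchar_t *dst, size_t dstlen, const char *buf, size_t nbytes)
{
	mbstate_t ps;
	const char *src = buf;
	size_t n;

	if (dstlen == 0)
		return ((size_t)-1);
	memset(&ps, 0, sizeof(ps));	/* begin in the initial conversion state */
	n = mbsnrtowcs(dst, &src, nbytes, dstlen - 1, &ps);
	if (n == (size_t)-1)
		return (n);
	/* No terminator is written when conversion stops at the byte limit. */
	dst[n] = L'\0';
	return (n);
}
```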
tjr
0bea5c0108
Add fast paths for conversion of plain ASCII characters.
2004-07-09 15:46:06 +00:00
tjr
3a9d81b253
Add a function to iterate over all characters in a particular character
class. This is necessary in order to implement tr(1) efficiently in
multibyte locales, since the brute force method of finding all characters
in a class is infeasible with a 32-bit (or wider) wchar_t.
2004-07-08 06:43:37 +00:00
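To show why iterating over a class beats brute force, here is a minimal sketch that walks a sorted table of ranges instead of testing every possible wchar_t value. The `class_range` structure and the `next_in_class` function are hypothetical stand-ins; the entry above does not name the function it added or describe its data layout.

```c
#include <stddef.h>

/* Hypothetical inclusive range of code points belonging to one class. */
struct class_range {
	long min;
	long max;
};

/*
 * Return the smallest class member greater than wc, or -1 if there
 * is none. Pass wc = -1 to obtain the first member.
 * ranges must be sorted by min and must not overlap.
 */
static long
next_in_class(const struct class_range *ranges, size_t nranges, long wc)
{
	size_t i;

	for (i = 0; i < nranges; i++) {
		if (wc < ranges[i].min)
			return (ranges[i].min);	/* jump over the gap */
		if (wc < ranges[i].max)
			return (wc + 1);	/* still inside this range */
	}
	return (-1);
}
```

The cost of visiting every member depends on the number of members and ranges, not on the width of wchar_t.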
ru
b5e1c67f19
Markup nits.
2004-07-05 06:39:03 +00:00
ru
6651f20e0d
Sort SEE ALSO references (in dictionary order, ignoring case).
2004-07-04 20:55:50 +00:00
ru
01548ace15
Mechanically kill hard sentence breaks.
2004-07-02 23:52:20 +00:00
ru
4b39413aeb
Removed trailing whitespace.
2004-07-02 19:07:33 +00:00
ru
95168a499a
Markup, grammar, and spelling fixes.
2004-06-30 20:09:10 +00:00
ru
6ad65dd7e0
Fixed a typo.
2004-06-30 19:32:41 +00:00
tjr
d04fd4700f
Prefix the names of members of _RuneLocale and its sub-structures
with ``__'' to avoid polluting the namespace. This doesn't change the
documented rune interface at all, but breaks applications that accessed
_RuneLocale directly.
2004-06-23 07:01:44 +00:00
mpp
98d43ce6f1
Spelling fixes.
2004-06-21 19:54:56 +00:00
tjr
fb60260f98
Buffer partial wide characters more efficiently: instead of storing the
multibyte representation in conversion state objects, store the
accumulated wide character, set number and number of bytes remaining
to avoid having to derive them every time mbrtowc() is called.
2004-05-27 10:54:34 +00:00
tjr
0efcf2d09b
Scan the source string for invalid wide characters in wcsrtombs()
in the dst == NULL case.
2004-05-25 10:45:24 +00:00
tjr
ea28a65744
Grab all the information we need about a character with one call to
__maskrune() instead of one direct call and one through iswprint().
2004-05-23 13:20:09 +00:00
tjr
aee8349a0c
Use conversion state objects to store the accumulated wide character,
low bound, and the number of bytes remaining instead of storing the
raw byte sequence and deriving them every time mbrtowc() is called.
This is much faster -- about twice as fast in some crude benchmarks.
2004-05-17 12:32:40 +00:00
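The idea in the entry above, keeping the accumulated wide character, the low bound, and the remaining byte count in the state rather than the raw bytes, can be sketched with a byte-at-a-time UTF-8 decoder. Everything here is an assumption made for illustration: the `utf8_state` layout and `utf8_feed` function are hypothetical, and the sketch omits checks for surrogates and values above U+10FFFF.

```c
#include <stddef.h>

/* Hypothetical conversion state, loosely modelled on the commit message. */
struct utf8_state {
	unsigned long ch;	/* wide character accumulated so far */
	unsigned long lowbound;	/* smallest value valid for this length */
	int want;		/* continuation bytes still expected */
};

/*
 * Feed one byte. Returns 1 and stores a character in *out when one
 * is complete, 0 when more bytes are needed, and -1 on invalid input.
 * After an error the caller should reset the state to all zeroes.
 */
static int
utf8_feed(struct utf8_state *st, unsigned char byte, unsigned long *out)
{
	if (st->want == 0) {
		if (byte < 0x80) {
			*out = byte;	/* plain ASCII needs no state */
			return (1);
		} else if ((byte & 0xe0) == 0xc0) {
			st->ch = byte & 0x1f; st->want = 1; st->lowbound = 0x80;
		} else if ((byte & 0xf0) == 0xe0) {
			st->ch = byte & 0x0f; st->want = 2; st->lowbound = 0x800;
		} else if ((byte & 0xf8) == 0xf0) {
			st->ch = byte & 0x07; st->want = 3; st->lowbound = 0x10000;
		} else
			return (-1);	/* stray continuation or invalid lead byte */
		return (0);
	}
	if ((byte & 0xc0) != 0x80)
		return (-1);		/* expected a continuation byte */
	st->ch = (st->ch << 6) | (byte & 0x3f);
	if (--st->want > 0)
		return (0);
	if (st->ch < st->lowbound)
		return (-1);		/* overlong encoding */
	*out = st->ch;
	return (1);
}
```

Because the partial value lives in the state, each call does a constant amount of work and never re-parses bytes seen earlier.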
tjr
b40b6a2d84
Use a simpler and faster buffering scheme for partial multibyte characters.
2004-05-17 11:16:14 +00:00
tjr
9e176d6b08
Use a simpler, faster buffering scheme for partial characters in mbrtowc().
2004-05-14 15:40:47 +00:00
tjr
3aa9288a48
Allow encoding modules to override the default implementations of
mbsrtowcs() and wcsrtombs(). Provide a fast implementation for the
trivial "NONE" encoding.
2004-05-13 11:20:27 +00:00
tjr
e442306798
Fix braino in previous: check that the second byte in the character
buffer is non-null when the character is two bytes long, not when
the buffer is two bytes long.
2004-05-13 03:08:28 +00:00
tjr
5ad27cd64f
Reduce overhead by calling internal versions of the multibyte conversion
functions directly wherever possible.
2004-05-12 14:26:54 +00:00
tjr
e3f042f4af
Move prototypes of various encoding-related functions into a new header
file to avoid extern'ing them all over the place.
2004-05-12 14:09:04 +00:00
tjr
d79e71957e
In the absence of proper validation, at least check that null bytes
do not appear as anything but the first byte of a multibyte character.
2004-05-11 14:08:22 +00:00
tjr
a8117b04ca
Use a binary search to find the range containing a character in
RuneRange arrays. This is much faster when there are hundreds of
ranges (as is the case in UTF-8 locales) and was inspired by a
similar change made by Apple in Darwin.
2004-05-09 13:04:49 +00:00
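The binary search mentioned in the entry above can be sketched as follows, assuming a sorted array of non-overlapping inclusive ranges. The `rune_range` structure and `find_range` function are hypothetical simplifications, not the actual RuneRange definitions.

```c
#include <stddef.h>

/* Hypothetical inclusive range; the array must be sorted by min. */
struct rune_range {
	long min;
	long max;
};

/*
 * Return the range containing c, or NULL if no range contains it.
 * Runs in O(log n) comparisons rather than a linear scan.
 */
static const struct rune_range *
find_range(const struct rune_range *base, size_t n, long c)
{
	size_t lo = 0, hi = n;	/* search the half-open interval [lo, hi) */

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (c < base[mid].min)
			hi = mid;		/* c lies before this range */
		else if (c > base[mid].max)
			lo = mid + 1;		/* c lies after this range */
		else
			return (&base[mid]);	/* min <= c <= max */
	}
	return (NULL);
}
```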
ache
a7c84134a6
Rewrite split_lines() to operate safely
PR: 62694
Submitted by: moulin p <moulin.p@calyopea.com>
2004-04-25 19:56:50 +00:00
tjr
8f8a2ad179
Perform some basic validation of multibyte conversion state objects.
2004-04-12 13:09:18 +00:00
tjr
e26da574d5
Remove a nonsensical remark about byte order markers in UTF-8 streams.
2004-04-12 12:58:41 +00:00
tjr
e768e0d54f
Document the meaning of the zero return value.
2004-04-11 05:19:19 +00:00
davidxu
289170412b
Fix a typo. I was locked out for two days from my machine.
2004-04-10 14:36:57 +00:00
tjr
17077e5ae6
Don't cast away const qualifiers.
Spotted by: bde
2004-04-10 00:27:52 +00:00
tjr
0ca2900d48
Update manual pages for change to C99 mbrtowc() semantics.
2004-04-08 09:59:02 +00:00
tjr
54a18fa1d6
Allow partial multibyte characters to accumulate in conversion state
objects passed to mbrtowc(), mbsrtowcs(), and mbrlen(), as required
by C99.
2004-04-07 10:48:19 +00:00
tjr
226e976dd7
Begin conversions for sgetrune() and sputrune() in the initial
conversion state.
2004-04-07 09:49:10 +00:00
tjr
47b6d3f343
Prepare to handle state-dependent encodings. This mainly involves not
taking shortcuts when it comes to storing and passing around conversion
states.
2004-04-07 09:47:56 +00:00
tjr
c3bbcd6ef6
Begin in the initial shift state in mbstowcs() and wcstombs().
(This change is non-functional since nothing uses states yet.)
2004-04-07 08:33:23 +00:00