freebsd-dev

Author	SHA1	Message	Date
Tim J. Robbins	c05bd9ae25	Buffer partial wide characters more efficiently: instead of storing the multibyte representation in conversion state objects, store the accumulated wide character, set number and number of bytes remaining to avoid having to derive them every time mbrtowc() is called.	2004-05-27 10:54:34 +00:00
Tim J. Robbins	18b2031298	Scan the source string for invalid wide characters in wcsrtombs() in the dst == NULL case.	2004-05-25 10:45:24 +00:00
Tim J. Robbins	675e7ddbee	Grab all the information we need about a character with one call to __maskrune() instead of one direct call and one through iswprint().	2004-05-23 13:20:09 +00:00
Tim J. Robbins	5e44d7ebe1	Use conversion state objects to store the accumulated wide character, low bound, and the number of bytes remaining instead of storing the raw byte sequence and deriving them every time mbrtowc() is called. This is much faster -- about twice as fast in some crude benchmarks.	2004-05-17 12:32:40 +00:00
Tim J. Robbins	6107476759	Use a simpler and faster buffering scheme for partial multibyte characters.	2004-05-17 11:16:14 +00:00
Tim J. Robbins	b666b593eb	Use a simpler, faster buffering scheme for partial characters in mbrtowc().	2004-05-14 15:40:47 +00:00
Tim J. Robbins	ea4ac135ff	Allow encoding modules to override the default implementations of mbsrtowcs() and wcsrtombs(). Provide a fast implementation for the trivial "NONE" encoding.	2004-05-13 11:20:27 +00:00
Tim J. Robbins	f789f94dbb	Fix braino in previous: check that the second byte in the character buffer is non-null when the character is two bytes long, not when the buffer is two bytes long.	2004-05-13 03:08:28 +00:00
Tim J. Robbins	6155c34adf	Reduce overhead by calling internal versions of the multibyte conversion functions directly wherever possible.	2004-05-12 14:26:54 +00:00
Tim J. Robbins	2051a8f2d5	Move prototypes of various encoding-related functions into a new header file to avoid extern'ing them all over the place.	2004-05-12 14:09:04 +00:00
Tim J. Robbins	88af941a73	In the absence of proper validation, at least check that null bytes do not appear as anything but the first byte of a multibyte character.	2004-05-11 14:08:22 +00:00
Tim J. Robbins	45a11576f3	Use a binary search to find the range containing a character in RuneRange arrays. This is much faster when there are hundreds of ranges (as is the case in UTF-8 locales) and was inspired by a similar change made by Apple in Darwin.	2004-05-09 13:04:49 +00:00
Andrey A. Chernov	28aec5a68c	Rewrite split_lines() to operate safely PR: 62694 Submitted by: moulin p <moulin.p@calyopea.com>	2004-04-25 19:56:50 +00:00
Tim J. Robbins	fc813796d2	Perform some basic validation of multibyte conversion state objects.	2004-04-12 13:09:18 +00:00
Tim J. Robbins	c282a0a1ed	Remove a nonsensical remark about byte order markers in UTF-8 streams.	2004-04-12 12:58:41 +00:00
Tim J. Robbins	78c4a3f225	Document the meaning of the zero return value.	2004-04-11 05:19:19 +00:00
David Xu	6464650388	Fix a typo. I was locked out for two days from my machine.	2004-04-10 14:36:57 +00:00
Tim J. Robbins	fa02ee78c8	Don't cast away const qualifiers. Spotted by: bde	2004-04-10 00:27:52 +00:00
Tim J. Robbins	8b8109275c	Update manual pages for change to C99 mbrtowc() semantics.	2004-04-08 09:59:02 +00:00
Tim J. Robbins	ca2dae426e	Allow partial multibyte characters to accumulate in conversion state objects passed to mbrtowc(), mbsrtowcs(), and mbrlen(), as required by C99.	2004-04-07 10:48:19 +00:00
Tim J. Robbins	e97e856274	Begin conversions for sgetrune() and sputrune() in the initial conversion state.	2004-04-07 09:49:10 +00:00
Tim J. Robbins	dc763237da	Prepare to handle state-dependent encodings. This mainly involves not taking shortcuts when it comes to storing and passing around conversion states.	2004-04-07 09:47:56 +00:00
Tim J. Robbins	ed870c6a8e	Begin in the initial shift state in mbstowcs() and wcstombs(). (This change is non-functional since nothing uses states yet.)	2004-04-07 08:33:23 +00:00
Tim J. Robbins	74f90def09	Prepare to handle state-dependent encodings. This mainly involves not taking shortcuts when it comes to storing and passing around conversion states.	2004-04-06 13:14:03 +00:00
Tim J. Robbins	4fb9e805dc	Remove support for emulating mbrtowc() and wcrtomb() in terms of the old rune interface now that it is no longer needed.	2004-04-04 11:31:29 +00:00
Tim J. Robbins	4f6d4aa30d	Reimplement the GB18030 encoding method using the new-style (mbrtowc()/ wcrtomb()) interface.	2004-04-04 11:00:42 +00:00
Tim J. Robbins	54c61797df	Reimplement the deprecated UTF2 encoding method using the UTF-8 code as a base. mbrtowc() and wcrtomb() are now implemented directly instead of being emulated with sgetrune() and sputrune().	2004-04-04 10:49:45 +00:00
Tim J. Robbins	6de4bcc717	Add cross-references to isideogram(3), isphonogram(3), isrune(3), isspecial(3) and wctype(3).	2004-03-30 08:11:57 +00:00
Tim J. Robbins	32d9553d83	Add basic manual pages for isideogram(), isphonogram(), isrune() and isspecial().	2004-03-30 07:23:54 +00:00
Tim J. Robbins	bee1de57ca	Trim cross-references.	2004-03-30 07:19:35 +00:00
Tim J. Robbins	ba6699086d	Document the isnumber() and ishexnumber() functions, and explain how they differ (at least in theory) from isdigit() and isxdigit().	2004-03-30 07:02:04 +00:00
Tim J. Robbins	ab02b93f75	Remove duplicate MLINK.	2004-03-29 21:46:52 +00:00
Tim J. Robbins	97062607cd	Recognize the "rune" character class in wctype().	2004-03-27 08:59:21 +00:00
Diomidis Spinellis	3f0a01ea87	Make consistent with the better written wcsrtombs function: - Fix syntax - Remove the (slightly wrong) duplicate explanation of the error condition - Change reference to invalid multibyte character into invalid wide character	2004-02-27 15:03:22 +00:00
Andrey A. Chernov	41ddc53bca	LC_ALL not always take priority over other LC_* Obtained from: NetBSD PR: 62047	2004-01-31 19:15:32 +00:00
Andrey A. Chernov	e6e9fb749a	Add reference to environ(7)	2004-01-29 09:27:24 +00:00
Jacques Vidrine	84d9142f58	Remove unused variables and function declarations. Add missing headers.	2004-01-06 18:26:15 +00:00
Andrey A. Chernov	ad4688e131	Properly advance "x/y/z" form slash-pointers in some rare cases PR: 60539	2003-12-24 10:16:46 +00:00
Andrey A. Chernov	6abda1f093	First byte of GBK-like sequences is 0x81, not 0x80	2003-12-19 12:54:42 +00:00
Tim J. Robbins	40c5c1f8a1	Set __mbrtowc and __wcrtomb correctly when changing to the C/POSIX locale. Save __mbrtowc and __wcrtomb and restore them when changing back to the cached locale. Reported by: perky	2003-12-08 23:52:22 +00:00
Tim J. Robbins	bc0b3a1800	Split multibyte(3) into separate manual pages for each function. Instead of just deleting it, turn the original page into a general overview of the multibyte character conversion functions, somewhat similar to stdio(3).	2003-12-07 06:33:52 +00:00
Tim J. Robbins	da44487bd7	Split the documentation for localeconv() off into a separate manual page.	2003-12-07 06:00:00 +00:00
Tim J. Robbins	8962b7a518	Update cross references after utf2/euc move.	2003-11-15 02:26:04 +00:00
Tim J. Robbins	f76c65296c	Remove section 4 versions of these manual pages, they have been moved into section 5.	2003-11-15 02:15:25 +00:00
Tim J. Robbins	93584b12e6	Install the section 5 versions of EUC and UTF2 manual pages instead of the section 4 versions.	2003-11-15 02:13:09 +00:00
Tim J. Robbins	ee0694adb9	Update the EUC and UTF2 manual pages for their new home in section 5. These have been repo-copied from euc.4 and utf2.4.	2003-11-15 01:54:46 +00:00
Tim J. Robbins	b1c572ad5b	Fix a typo that caused mbrtowc() to always return 0.	2003-11-11 07:25:05 +00:00
Tim J. Robbins	cc7a3285a5	Add one more cross-reference to gb2312(5).	2003-11-08 03:23:11 +00:00
Tim J. Robbins	16854d3c8f	Add cross-references to new gb2312(5) manual page.	2003-11-08 03:07:56 +00:00
Tim J. Robbins	e31d6d8149	Add a fairly simple manual page for the new GB2312 encoding.	2003-11-08 03:02:45 +00:00
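
Commit 45a11576f3 above replaces a linear scan with a binary search to find the range containing a character. The idea can be sketched as follows. This is only an illustration: `struct range_entry`, its fields, and `find_range()` are hypothetical names invented here, not the structures or functions in the actual commit, and the sketch assumes the ranges are sorted by lower bound and do not overlap.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical range entry: an inclusive [min, max] span of character
 * codes plus some attribute word. Illustrative only; this is not the
 * libc structure layout.
 */
struct range_entry {
	unsigned long min;
	unsigned long max;
	unsigned long attrs;
};

/*
 * Return the entry whose [min, max] span contains c, or NULL if none does.
 * Assumes ranges[] is sorted by min and the spans do not overlap.
 */
const struct range_entry *
find_range(const struct range_entry *ranges, size_t n, unsigned long c)
{
	size_t lo = 0, hi = n;	/* search the half-open interval [lo, hi) */

	assert(ranges != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (c < ranges[mid].min)
			hi = mid;		/* c lies below this span */
		else if (c > ranges[mid].max)
			lo = mid + 1;		/* c lies above this span */
		else
			return (&ranges[mid]);	/* min <= c <= max */
	}
	return (NULL);
}
```

With a few hundred ranges, each lookup inspects at most about nine entries instead of scanning them all, which is consistent with the speedup the commit message describes for UTF-8 locales.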

422 Commits
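
Several commits in this log (5e44d7ebe1, ca2dae426e) concern buffering partial multibyte characters by keeping the accumulated wide character, a low bound, and the number of bytes remaining in the conversion state. A minimal sketch of that scheme for UTF-8 might look like the following. Everything here is an assumption made for illustration: `struct utf8_state`, `utf8_feed()`, and the `DEC_*` constants are hypothetical names, not the libc implementation, and the decoder is simplified (it does not reject surrogates or values above U+10FFFF, and it returns 1 rather than 0 for a null byte).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	DEC_INCOMPLETE	((size_t)-2)	/* input ended mid-character */
#define	DEC_INVALID	((size_t)-1)	/* malformed sequence */

/* Hypothetical conversion state; all zeroes is the initial state. */
struct utf8_state {
	uint32_t ch;		/* bits accumulated so far */
	uint32_t lobound;	/* smallest value valid for this length */
	int	 want;		/* continuation bytes still expected */
};

/*
 * Consume up to n bytes from s. On completing a character, store it in
 * *out and return the number of bytes consumed by this call. Otherwise
 * return DEC_INCOMPLETE (state retained) or DEC_INVALID (state reset).
 */
size_t
utf8_feed(struct utf8_state *st, const unsigned char *s, size_t n,
    uint32_t *out)
{
	size_t i = 0;

	assert(st != NULL && out != NULL);
	if (n == 0)
		return (DEC_INCOMPLETE);
	if (st->want == 0) {
		/* Initial state: classify the lead byte once. */
		unsigned char b = s[i++];

		if (b < 0x80) {
			*out = b;
			return (1);
		} else if ((b & 0xe0) == 0xc0) {
			st->ch = b & 0x1f; st->want = 1; st->lobound = 0x80;
		} else if ((b & 0xf0) == 0xe0) {
			st->ch = b & 0x0f; st->want = 2; st->lobound = 0x800;
		} else if ((b & 0xf8) == 0xf0) {
			st->ch = b & 0x07; st->want = 3; st->lobound = 0x10000;
		} else
			return (DEC_INVALID);
	}
	while (i < n) {
		unsigned char b = s[i++];

		if ((b & 0xc0) != 0x80) {
			st->want = 0;
			return (DEC_INVALID);
		}
		st->ch = (st->ch << 6) | (b & 0x3f);
		if (--st->want == 0) {
			if (st->ch < st->lobound)	/* overlong encoding */
				return (DEC_INVALID);
			*out = st->ch;
			return (i);
		}
	}
	return (DEC_INCOMPLETE);	/* resume from st on the next call */
}
```

Because the state already holds the partially assembled value, a resumed call only has to shift in the new continuation bytes; nothing is re-derived from a saved byte sequence. For example, feeding the bytes 0xE2, 0x82, 0xAC one at a time returns `DEC_INCOMPLETE` twice and then yields U+20AC on the third call.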