Commit Graph

15 Commits

Author SHA1 Message Date
David Schultz
d7af8cf14b Previously, vfscanf()'s wide character processing functions were
reading wide characters manually.  With this change, they now use
fgetwc().  To make this work, we use an internal version of fgetwc()
with a few extensions: it takes an mbstate * because non-wide streams
don't have a built-in mbstate, and it indicates the number of bytes
read.

vfscanf() now resembles vfwscanf() more closely.  Minor functional
improvements include working xlocale support in vfscanf(), setting the
stream error indicator on encoding errors, and proper handling of
shift-based encodings.  (Actually, making shift-based encodings work
with non-wide streams is hopeless, but the implementation now matches
the broken specification.)
2012-04-29 16:28:39 +00:00
David Chisnall
3c87aa1d3d Implement xlocale APIs from Darwin, mainly for use by libc++. This adds a
load of _l suffixed versions of various standard library functions that use
the global locale, making them take an explicit locale parameter.  Also
adds support for per-thread locales.  This work was funded by the FreeBSD
Foundation.

Please test any code you have that uses the C standard locale functions!

Reviewed by:    das (gdtoa changes)
Approved by:    dim (mentor)
2011-11-20 14:45:42 +00:00
John Baldwin
1e98f88776 Next stage of stdio cleanup: Retire __sFILEX and merge the fields back into
__sFILE.  This was supposed to be done in 6.0.  Some notes:
- Where possible I restored the various lines to their pre-__sFILEX state.
- Retire INITEXTRA() and just initialize the wchar bits (orientation and
  mbstate) explicitly instead.  The various places that used INITEXTRA
  didn't need the locking fields or _up initialized.  (Some places needed
  _up to exist and not be off the end of a NULL or garbage pointer, but
  they didn't require it to be initialized to a specific value.)
- For now, stdio.h "knows" that pthread_t is a 'struct pthread *' to
  avoid namespace pollution of including all the pthread types in stdio.h.
  Once we remove all the inlines and make __sFILE private it can go back
  to using pthread_t, etc.
- This does not remove any of the inlines currently and does not change
  any of the public ABI of 'FILE'.

MFC after:	1 month
Reviewed by:	peter
2008-04-17 22:17:54 +00:00
Tim J. Robbins
f9ceea9bf1 Call __mbrtowc() and __wcrtomb() directly instead of taking detours
through mbrtowc() and wcrtomb().
2004-07-20 08:27:27 +00:00
Tim J. Robbins
fcc5191787 Slightly reorganize and simplify.
2004-07-09 15:12:10 +00:00
Tim J. Robbins
d6ed810a67 Perform conversions straight from the stream buffer instead of scanning
through byte by byte with mbrtowc(). In the usual case (buffer is big
enough to contain the multibyte character, character does not straddle
buffer boundary) this results in only one call to mbrtowc() for each
wide character read.
2004-05-22 15:41:03 +00:00
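
The buffer-based conversion described in d6ed810a67 can be illustrated roughly as follows. This is only a sketch, not the committed code: sketch_decode() is a hypothetical name, the caller is assumed to own a plain byte buffer rather than the stream's internal buffer, and the encoding is assumed to be stateless so that a null wide character occupies exactly one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not the FreeBSD implementation): decode one wide
 * character from buf[0..len) with a single mbrtowc() call.
 *
 * Returns the number of bytes consumed, (size_t)-2 if the character is
 * incomplete (its bytes have been absorbed into *ps and the caller must
 * supply more input), or (size_t)-1 on an encoding error.
 */
static size_t
sketch_decode(wchar_t *wc, const char *buf, size_t len, mbstate_t *ps)
{
	size_t n;

	assert(wc != NULL && buf != NULL && ps != NULL);
	n = mbrtowc(wc, buf, len, ps);
	if (n == 0)
		return (1);	/* null wide character: assume one byte */
	return (n);
}
```

When the whole character is already in the buffer, one call is enough; only the (size_t)-2 case forces the caller to refill and call again.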
Tim J. Robbins
87275e436a Associate a multibyte conversion state object with each stream. Reset it
to the initial state when a stream is opened or seeked upon. Use the
stream's conversion state object instead of a freshly-zeroed one in
fgetwc(), fputwc() and ungetwc().

This is only a performance improvement for now, but it would also be
required in order to support state-dependent encodings.
2004-05-22 15:19:41 +00:00
Tim J. Robbins
93996f6d58 Prepare to handle trivial state-dependent encodings. Full support for
state-dependent encodings with locking shifts will come later if there
is demand for it.
2004-04-07 09:55:05 +00:00
Tim J. Robbins
a27a4b3690 Pass mbrtowc() and wcrtomb() NULL instead of a pointer to a freshly zeroed
mbstate_t object that they ignore. The zeroing is fairly expensive, and it
will never be necessary in these functions; when we support state-dependent
encodings, we will pass in a pointer to the file's mbstate_t object, and
only zero it at the time the file gets opened.
2003-11-04 11:05:55 +00:00
Tim J. Robbins
6180233fd8 Set the error bit on the stream if an encoding error occurs. Improve
handling of multibyte sequences representing null wide characters.
2002-10-16 12:09:43 +00:00
Tim J. Robbins
8f030a44b8 Introduce unlocked versions of fputwc() and fgetwc() called __fputwc()
and __fgetwc() which can be used when we know the file is locked.
2002-09-20 13:20:41 +00:00
Tim J. Robbins
0b7bc80226 Optimise the common case where no special encoding is in use (LC_CTYPE is "C"
or "POSIX", other European locales). Use __sgetc() and __sputc() where
possible to avoid a wasteful lock and unlock for each byte and to avoid
function call overhead.
2002-09-18 12:17:28 +00:00
Tim J. Robbins
bddc6280f2 Logic error in previous: don't exit the loop when an incomplete multibyte
sequence is detected.
2002-09-18 10:21:41 +00:00
Tim J. Robbins
24990dfad0 Reimplement the functionality of fgetrune(), fputrune(), and fungetrune()
here in terms of mbrtowc(), wcrtomb(), and the single-byte I/O functions.
The rune I/O functions are about to become deprecated in favour of the
ones provided by ISO C90 Amd. 1 and C99.
2002-09-18 05:58:11 +00:00
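
As a rough illustration of the approach named in 24990dfad0 (wide-character input built from mbrtowc() and single-byte I/O), a minimal sketch might look like the following. The name sketch_getwc() and the explicit mbstate_t parameter are assumptions made for this example and do not reflect the committed code. The loop keeps reading when mbrtowc() reports an incomplete sequence, which is the behaviour the follow-up fix bddc6280f2 describes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not the FreeBSD implementation): read one wide
 * character from fp by feeding bytes to mbrtowc() one at a time.
 * Returns WEOF on end of file or on an encoding error.
 */
static wint_t
sketch_getwc(FILE *fp, mbstate_t *ps)
{
	wchar_t wc;
	size_t n;
	int c;
	char b;

	assert(fp != NULL && ps != NULL);
	while ((c = getc(fp)) != EOF) {
		b = (char)c;
		n = mbrtowc(&wc, &b, 1, ps);
		if (n == (size_t)-1)
			return (WEOF);	/* invalid sequence; errno is EILSEQ */
		if (n == (size_t)-2)
			continue;	/* incomplete: keep reading bytes */
		return ((wint_t)wc);	/* n is 0 (null) or 1 */
	}
	return (WEOF);
}
```

The caller is expected to zero the mbstate_t once before the first call and then reuse it, so a partially read character carries over between bytes.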
Tim J. Robbins
e74101e4ef Basic support for wide character I/O: getwc(), fgetwc(), getwchar(),
putwc(), fputwc(), putwchar(), ungetwc(), fwide().
2002-08-13 09:30:41 +00:00