return -1 regardless of what s points to, mbtowc(&w, s, 1) sets w to a
null wide character when s points to a null byte. This seems to be closer
to what most other implementations do, but the C99 standard contradicts
itself for these cases.
"UTF2" method. Although UTF-8 and the old UTF2 encoding are compatible
for 16-bit characters, the new UTF-8 implementation is much more strict
about rejecting malformed input and also handles the full 31 bit range
of characters.
international monetary values: int_p_cs_precedes, int_n_cs_precedes,
int_p_sep_by_space, int_n_sep_by_space, int_p_sign_posn, int_n_sign_posn.
This should not break existing binaries or LC_MONETARY data files.
Reviewed by: ache
MFC after: 1 month
get applications to move to the ISO C interfaces as well as have the
freedom to replace the rune interfaces with ones that support stateful
conversions some time in the future.
called <machine/_types.h>.
o <machine/ansi.h> will continue to live so it can define MD clock
macros, which are only MD because of gratuitous differences between
architectures.
o Change all headers to make use of this. This mainly involves
changing:
#ifdef _BSD_FOO_T_
typedef _BSD_FOO_T_ foo_t;
#undef _BSD_FOO_T_
#endif
to:
#ifndef _FOO_T_DECLARED
typedef __foo_t foo_t;
#define _FOO_T_DECLARED
#endif
Concept by: bde
Reviewed by: jake, obrien
currently cached data. It allows a number of nice things, like: removing
fallback code from single locale loading, remove memory leak when LC_CTYPE
data loaded again and again, efficient cache use, not only for
setlocale(locale1); setlocale(locale1), but for setlocale(locale1);
setlocale("C"); setlocale(locale1) too (i.e. data file loaded only once).
towlower() and towupper() required by ISO C90 Amd. 1.
iswascii(), iswhexnumber(), iswideogram(), iswnumber(), iswphonogram(),
iswrune() and iswspecial() have also been implemented for consistency
with the BSD extensions in <ctype.h>.
2) Move incomplete check for / in locale name from env section to
loadlocale(), add check for "." and ".." too.
It allows to check any argument, not env only.
3) Redesing LOAD_CATEGORY macro to eliminate code duplication.
4) Try harder in fallback code: if old locale can't be restored,
load "C" locale
5) White space formatting, long lines, etc.
Satoshi NIIMI-san kindly explained that EUC does not limit the byte length to
any arbitrary number.
We now set the limit to the maximum octet length of the codeset and it is
locale-specific.
Submitted by: Yong-Jhen Hong <winard@ms11.url.com.tw>
Also, make an internal _getprogname() that is used only inside
libc. For libc, getprogname(3) is a weak symbol in case a
function of the same name is defined in userland.
deprecated in favor of the POSIX-defined lowercase variants.
o Change all occurrences of NTOHL() and associated marcros in the
source tree to use the lowercase function variants.
o Add missing license bits to sparc64's <machine/endian.h>.
Approved by: jake
o Clean up <machine/endian.h> files.
o Remove unused __uint16_swap_uint32() from i386's <machine/endian.h>.
o Remove prototypes for non-existent bswapXX() functions.
o Include <machine/endian.h> in <arpa/inet.h> to define the
POSIX-required ntohl() family of functions.
o Do similar things to expose the ntohl() family in libstand, <netinet/in.h>,
and <sys/param.h>.
o Prepend underscores to the ntohl() family to help deal with
complexities associated with having MD (asm and inline) versions, and
having to prevent exposure of these functions in other headers that
happen to make use of endian-specific defines.
o Create weak aliases to the canonical function name to help deal with
third-party software forgetting to include an appropriate header.
o Remove some now unneeded pollution from <sys/types.h>.
o Add missing <arpa/inet.h> includes in userland.
Tested on: alpha, i386
Reviewed by: bde, jake, tmm
is correct but less than useful. There is some uncertainty about whether
isblank() is in C99, but it is certainly not in C90. It just conforms
to C89 because it is a conforming extension.
1. ctype.h defines digittoint(), isnumber() and ishexnmber(), yet
they are not documented in any of the manpages.
2. The ctype manpage references a non-existent manpage for
digittoint().
3. The isascii() manpage claims it is standards compliant, when
it isn't.
4. isblank() claims it is _not_ standards compliant, when it
is.
Fix by including the appropriate .Nm entries, and with a new digittoint.3
page.
PR: docs/26451
Submitted by: Adrian Filipi-Martin <adrian@ubergeeks.com>
LC_MESSAGES related data was installed to <locale>/LC_MESSAGES file.
Now it go to <locale>/LC_MESSAGES/SYS_LC_MESSAGES file. LC_MESSAGES
directory is supposed to be storage of message catalogs of userland tools.
This should allow us to avoid many potential problems with future
libintl related functionality introduction.
Thanks for useful suggestions about correct way how to replace plain
files with directories at installworld stage to: Ruslan Ermilov <ru>
Avoid using parenthesis enclosure macros (.Pq and .Po/.Pc) with plain text.
Not only this slows down the mdoc(7) processing significantly, but it also
has an undesired (in this case) effect of disabling hyphenation within the
entire enclosed block.
The below text is quoted from the latest POSIX draft:
: The values of locale categories shall be determined by a precedence
: order; the first condition met below determines the value:
:
: 1. If the LC_ALL environment variable is defined and is not null,
: the value of LC_ALL shall be used.
: 2. If the LC_* environment variable (LC_COLLATE, LC_CTYPE, LC_MESSAGES,
: LC_MONETARY, LC_NUMERIC, LC_TIME) is defined and is not null, the
: value of the environment variable shall be used to initialize the
: category that corresponds to the environment variable.
: 3. If the LANG environment variable is defined and is not null, the
: value of the LANG environment variable shall be used.
: 4. If the LANG environment variable is not set or is set to the empty
: string, the implementation-defined default locale shall be used.
The conditions 1 and 2 were interchanged, i.e., LC_* were looked first,
then LC_ALL, then LANG (note that LC_ALL and LANG were essentially the
same, providing the default, with LC_ALL taking precedence over LANG).
Now, LC_ALL and LANG serve the different purposes. LC_ALL overrides
any LC_*, and LANG provides the default fallback.
Testcase:
/usr/bin/env LC_ALL=C LC_TIME=de_DE.ISO_8859-1 /bin/date
Should return date in the "C" locale format.
Inspired by: date(1) reference page in the Draft
LC_NUMERIC::grouping) values.
. Always set __XXX_changed flags then loading numeric & monetary locale
categories to allow localeconv() to use C locale also.
LC_NUMERIC fields, but only for *grouping fields - other fields are converted
to a chars in localeconv(), so final change is:
"-1" -> "127"
127 here is because CHAR_MAX supposed, which is _positive_ (SUSv2 requirement),
not negative as 255. It is still a bit of hack. To find real CHAR_MAX will be
better to sprintf() it once somewhere in static buffer. *grouping parsing
still broken and missing and needs to be implemented.
LC_MONETARY, LC_NUMERIC are byte-arrays, not ASCII strings!
Fix "C" locale, change "-1" to {CHAR_MAX, '\0'} according to standards.
This is only partial fix - locale loading procedure remains broken as before
and load too big values for all locales. All numeric strings there should be
converted with something like atoi() and placed into bytes. Maybe I do it
later, if someone will not fix it faster.
adding (weak definitions to) stubs for some of the pthread
functions. If the threads library is linked in, the real
pthread functions will pulled in.
Use the following convention for system calls wrapped by the
threads library:
__sys_foo - actual system call
_foo - weak definition to __sys_foo
foo - weak definition to __sys_foo
Change all libc uses of system calls wrapped by the threads
library from foo to _foo. In order to define the prototypes
for _foo(), we introduce namespace.h and un-namespace.h
(suggested by bde). All files that need to reference these
system calls, should include namespace.h before any standard
includes, then include un-namespace.h after the standard
includes and before any local includes. <db.h> is an exception
and shouldn't be included in between namespace.h and
un-namespace.h namespace.h will define foo to _foo, and
un-namespace.h will undefine foo.
Try to eliminate some of the recursive calls to MT-safe
functions in libc/stdio in preparation for adding a mutex
to FILE. We have recursive mutexes, but would like to avoid
using them if possible.
Remove uneeded includes of <errno.h> from a few files.
Add $FreeBSD$ to a few files in order to pass commitprep.
Approved by: -arch
be used to point to a bad locale file. This is only believed to be a
minor security risk - the only risk is if some program uses the result
of a localized string as a format specifier in a vulnerable function
like sprintf(). No such code is believed to exist in the FreeBSD base
system, although it is possible that badly written third party code
would do that.
Submitted by: imp
Approved by: ache
of the C++ stdlib. Our ctype.h uses symbols of the form _<X> to denote the
various character classes. Our ctype.h also extends the usual ctype.h
offering by adding the "_T" (special) class. Problem is parts of the STL
also use the symbol "_T" as its parameterized type. These two uses are
incompatible.
Thus change the form of the symbols used in ctype to something that fixes
the current problem and is less likely to cause conflicts in the future.
Requested by: Tomoaki NISHIYAMA <tomoaki@biol.s.u-tokyo.ac.jp>
Ok'ed by: JKH
just use _foo() <-- foo(). In the case of a libpthread that doesn't do
call conversion (such as linuxthreads and our upcoming libpthread), this
is adequate. In the case of libc_r, we still need three names, which are
now _thread_sys_foo() <-- _foo() <-- foo().
Convert all internal libc usage of: aio_suspend(), close(), fsync(), msync(),
nanosleep(), open(), fcntl(), read(), and write() to _foo() instead of foo().
Remove all internal libc usage of: creat(), pause(), sleep(), system(),
tcdrain(), wait(), and waitpid().
Make thread cancellation fully POSIX-compliant.
Suggested by: deischen
points. For library functions, the pattern is __sleep() <--
_libc_sleep() <-- sleep(). The arrows represent weak aliases. For
system calls, the pattern is _read() <-- _libc_read() <-- read().