The below text is quoted from the latest POSIX draft:
: The values of locale categories shall be determined by a precedence
: order; the first condition met below determines the value:
:
: 1. If the LC_ALL environment variable is defined and is not null,
: the value of LC_ALL shall be used.
: 2. If the LC_* environment variable (LC_COLLATE, LC_CTYPE, LC_MESSAGES,
: LC_MONETARY, LC_NUMERIC, LC_TIME) is defined and is not null, the
: value of the environment variable shall be used to initialize the
: category that corresponds to the environment variable.
: 3. If the LANG environment variable is defined and is not null, the
: value of the LANG environment variable shall be used.
: 4. If the LANG environment variable is not set or is set to the empty
: string, the implementation-defined default locale shall be used.
The conditions 1 and 2 were interchanged, i.e., LC_* were looked first,
then LC_ALL, then LANG (note that LC_ALL and LANG were essentially the
same, providing the default, with LC_ALL taking precedence over LANG).
Now, LC_ALL and LANG serve the different purposes. LC_ALL overrides
any LC_*, and LANG provides the default fallback.
Testcase:
/usr/bin/env LC_ALL=C LC_TIME=de_DE.ISO_8859-1 /bin/date
Should return date in the "C" locale format.
Inspired by: date(1) reference page in the Draft
LC_NUMERIC::grouping) values.
. Always set __XXX_changed flags then loading numeric & monetary locale
categories to allow localeconv() to use C locale also.
LC_NUMERIC fields, but only for *grouping fields - other fields are converted
to a chars in localeconv(), so final change is:
"-1" -> "127"
127 here is because CHAR_MAX supposed, which is _positive_ (SUSv2 requirement),
not negative as 255. It is still a bit of hack. To find real CHAR_MAX will be
better to sprintf() it once somewhere in static buffer. *grouping parsing
still broken and missing and needs to be implemented.
LC_MONETARY, LC_NUMERIC are byte-arrays, not ASCII strings!
Fix "C" locale, change "-1" to {CHAR_MAX, '\0'} according to standards.
This is only partial fix - locale loading procedure remains broken as before
and load too big values for all locales. All numeric strings there should be
converted with something like atoi() and placed into bytes. Maybe I do it
later, if someone will not fix it faster.
adding (weak definitions to) stubs for some of the pthread
functions. If the threads library is linked in, the real
pthread functions will pulled in.
Use the following convention for system calls wrapped by the
threads library:
__sys_foo - actual system call
_foo - weak definition to __sys_foo
foo - weak definition to __sys_foo
Change all libc uses of system calls wrapped by the threads
library from foo to _foo. In order to define the prototypes
for _foo(), we introduce namespace.h and un-namespace.h
(suggested by bde). All files that need to reference these
system calls, should include namespace.h before any standard
includes, then include un-namespace.h after the standard
includes and before any local includes. <db.h> is an exception
and shouldn't be included in between namespace.h and
un-namespace.h namespace.h will define foo to _foo, and
un-namespace.h will undefine foo.
Try to eliminate some of the recursive calls to MT-safe
functions in libc/stdio in preparation for adding a mutex
to FILE. We have recursive mutexes, but would like to avoid
using them if possible.
Remove uneeded includes of <errno.h> from a few files.
Add $FreeBSD$ to a few files in order to pass commitprep.
Approved by: -arch
be used to point to a bad locale file. This is only believed to be a
minor security risk - the only risk is if some program uses the result
of a localized string as a format specifier in a vulnerable function
like sprintf(). No such code is believed to exist in the FreeBSD base
system, although it is possible that badly written third party code
would do that.
Submitted by: imp
Approved by: ache
of the C++ stdlib. Our ctype.h uses symbols of the form _<X> to denote the
various character classes. Our ctype.h also extends the usual ctype.h
offering by adding the "_T" (special) class. Problem is parts of the STL
also use the symbol "_T" as its parameterized type. These two uses are
incompatible.
Thus change the form of the symbols used in ctype to something that fixes
the current problem and is less likely to cause conflicts in the future.
Requested by: Tomoaki NISHIYAMA <tomoaki@biol.s.u-tokyo.ac.jp>
Ok'ed by: JKH
just use _foo() <-- foo(). In the case of a libpthread that doesn't do
call conversion (such as linuxthreads and our upcoming libpthread), this
is adequate. In the case of libc_r, we still need three names, which are
now _thread_sys_foo() <-- _foo() <-- foo().
Convert all internal libc usage of: aio_suspend(), close(), fsync(), msync(),
nanosleep(), open(), fcntl(), read(), and write() to _foo() instead of foo().
Remove all internal libc usage of: creat(), pause(), sleep(), system(),
tcdrain(), wait(), and waitpid().
Make thread cancellation fully POSIX-compliant.
Suggested by: deischen
points. For library functions, the pattern is __sleep() <--
_libc_sleep() <-- sleep(). The arrows represent weak aliases. For
system calls, the pattern is _read() <-- _libc_read() <-- read().
else, it is equivalent to strdup(). So, we will check if the substitution
tables are trivial at the load time, and possibly save 2 calls to
__collate_substitute() in strcoll().
Still, __collate_substitute() should not exist.
Other minor optimizations. I got ~30% speedup in strcoll() for 50 char strings,
~40% speedup for 100 char strings, and unmeasurable speedup for 1M strings.
Collates are still terribly slow. To make them reasonable fast,
__collate_substitute() should be killed.
track.
The $Id$ line is normally at the bottom of the main comment block in the
man page, separated from the rest of the manpage by an empty comment,
like so;
.\" $Id$
.\"
If the immediately preceding comment is a @(#) format ID marker than the
the $Id$ will line up underneath it with no intervening blank lines.
Otherwise, an additional blank line is inserted.
Approved by: bde
Submitted by: Yung-Jen Hung <winard@u3717a.dorm.ccu.edu.tw>
Reviewed by: bearscorp.bbs@bbs.life.nthu.edu.tw
_BIG5_sgetrune() in libc doesn't work well, this commit will fix it.
o use braces to avoid potentially ambiguous else
o don't default to type int (and also remove a useless register
modifier).
o Use parens around assignment values used as truth values.
o Remove unused function.
Reviewed by: obrien and chuckr
In some cases replace if (a == null) a = malloc(x); else a =
realloc(a, x); with simple reallocf(a, x). Per ANSI-C, this is
guaranteed to be the same thing.
I've been running these on my system here w/o ill effects for some
time. However, the CTM-express is at part 6 of 34 for the CAM
changes, so I've not been able to do a build world with the CAM in the
tree with these changes. Shouldn't impact anything, but...
the diff is attached below. This is done on the 3.0 source-tree.
I have test this on 2.2-stable before, but I don't have a 3.0 machine
right now.
This patch is mainly to make libc support BIG5 encoding, thus add
zh_TW.BIG5 locale to 3.0.
Submitted by: Chen Hsiung Chan <frankch@waru.life.nthu.edu.tw>
PR: 4555
Submitted by: Dmitrij Tejblum <tejblum@arc.hq.cti.ru>
[0x0400 - 0xffff] [bbbbbbbb.bbbbbbbb] -> 1110bbbb, 10bbbbbb, 10bbbbbb
.Ed
.Pp
If more than a single representation of a value exists (for example,
0x00; 0xC0 0x80; 0xE0 0x80 0x80) the shortest representation is always
used (but the longer ones will be correctly decoded).
.Pp
The final three encodings provided by X-Open:
.Bd -literal
[00000000.000bbbbb.bbbbbbbb.bbbbbbbb] ->
11110bbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
[000000bb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
111110bb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
[0bbbbbbb.bbbbbbbb.bbbbbbbb.bbbbbbbb] ->
1111110b, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb, 10bbbbbb
.Ed
.Pp
which provides for the entire proposed ISO-10646 31 bit standard are currently
not implemented.
.Sh "SEE ALSO"
.Xr mklocale 1 ,
.Xr setlocale 3
@
1.4
log
@Don't use hardcoded *roff font change requests. Do it
via mdoc macros instead.
@
text
@d37 1
a37 1
.Dd "June 4, 1993"
@
1.3
log
@Very minor mdoc cleanup.
@
text
@d44 2
a45 1
\fBENCODING "UTF2"\fP
@
1.2
log
@Another round of various man page cleanups.
@
text
@d65 1
a65 1
.sp
d81 1
a81 1
.sp
@
1.2.2.1
log
@YAMFC:
Commit all of the -current changes that apply to 2.2. These fall into
several categories:
- Cosmetic/mdoc changes. They don't really afect the output
at all, but having them in 2.2 will make it easier to diff the man
pages later when looking for real changes.
- Update some man pages to reflect the current 2.2 header files.
- Sort xrefs.
- A few typo fixes.
- And a few changes that actualy added text to the man page that should
be reflected in 2.2.
- Add some missing MLINKS.
Requested by: bde
@
text
@d44 1
a44 2
.Nm ENCODING
.Qq UTF2
d65 1
a65 1
.Pp
d81 1
a81 1
.Pp
@
1.2.2.2
log
@MFC: Just the locale fixes (small doc tweaks for the most part)
and the new strptime(3) call. Having added something, does this
require a version bump? Haven't we bumped once already?
There are a *LOT* of additional 3.0 changes to be merged but I'm not
entirely comfortable with some of them so I'll take the conservative
(read: cowardly :) way out and just merge this much.
@
text
@d37 1
a37 1
.Dd June 4, 1993
@
1.1
log
@Initial revision
@
text
@d41 1
a41 1
.Nm UTF2
@
1.1.1.1
log
@BSD 4.4 Lite Lib Sources
@
text
@@
1.1.1.1.6.1
log
@Phase 2 of merge - also fix things broken in phase 1.
Watch out for falling rock until phase 3 is over!
libc completely merged except for phkmalloc & rfork (don't know if David
wants that).
Some include files in sys/ had to be updated in order to bring in libc.
@
text
@d41 1
a41 1
.Nm utf2
@
1.1.1.1.6.2
log
@This 3rd mega-commit should hopefully bring us back to where we were.
I can get it to `make world' succesfully, anyway!
@
text
@d41 1
a41 1
.Nm UTF2
@
back as designed in *BSD
Also it not violates current standards but
1) No other Unixes have this feature
2) It broke Kerberos5 (isprint) and God knows what else
(not all vendors will agree to treat FreeBSD as special case for support
since (1))
2) Give false localization sense (programs mimic to be 8859-1
localized) which prevents true localization.
so that all these makefiles can be used to build libc_r too.
Added .if ${LIB} == "c" tests to restrict man page builds to libc
to avoid needlessly building them with libc_r too.
Split libc Makefile into Makefile and Makefile.inc to allow the
libc_r Makefile to include Makefile.inc too.
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.
Vulnerable: all programs that use setlocale(LC_COLLATE),
setlocale(LC_CTYPE), or setlocale(LC_ALL). The only setuid/setgid
binary i've found for this is w(1).
Should go into 2.2.
Since locale reading code not resistent against stack overflowing or
similar intruder attacks, don't allow PATH_LOCALE env variable action
for s-bit programs (non-standard locale path setting)
strdup() it to prevent unsetenv() or setenv() effects. Check its length to
not allow user to overflow internal locale buffer. Move PATH_LOCALE
handling code into one place.
POSIX: make better stub for LC_MONETARY & LC_NUMERIC, now it check
locale directory existance instead of refusing all non-C non-POSIX
locales. POSIX treats empty locale env variable as unset variable
while our old code treats it as "C" locale, fix it. Implement previous locale
restoring, if locale setting fails. Old code assumes success if some
of LC_ALL subset is successed even other fails, POSIX treats it as
failure with previous locale restoring, fix it.
Remove unneccessary length checking in currentlocale()
inside libc. Add collate_range_cmp as alias to __collate_range_cmp
for temp. backward compatibility.
collate_range_cmp will be replaced with direct code for each
external program for compatibility with the rest of world
for gcc >= 2.5 and no-ops for gcc >= 2.6. Converted to use __dead2
or __pure2 where it wasn't already done, except in math.h where use
of __pure was mostly wrong.
in a bunch of man pages.
Use the correct .Bx (BSD UNIX) or .At (AT&T UNIX) macros
instead of explicitly specifying the version in the text
in a bunch of man pages.
this man page to prevent half of it from coming out with underlines.
This man page needs to be gone over to fully convert it to mdoc format.
This closes PR#1440.
Submitted by: Jens Schweikhardt <schweikhardt@rus.uni-stuttgart.de>
If _ANSI_SOURCE or _POSIX_SOURCE is defined, then <ctype.h> had to
be included before <stddef.h> or <stdlib.h> to get rune_t declared.
Now rune_t is declared perfectly bogusly in all cases when <ctype.h>
is included.
This change breaks similar (but more convoluted) convolutions in the
stddef.h in gcc distributions. Ports of gcc should avoid using the
gcc headers.
by me). This probably loses for multibyte characters, but I have no
way of telling. I'll let ache decide whether to add this support to
startup_setlocale. Note that for this to make any sense at all, the
symlinks in /usr/share/locale must go. (For the moment, this doesn't
make any difference since there are no locales supplied.)
Obtained from: Arthur David Olson <ado@elsie.nci.nih.gov>
isctype.c:
o The tolower() and toupper() functions duplicated too much code
and were out of date (surprise). This didn't matter because
it was difficult to call them.
o Change formatting to be more like that in <ctype.h> (with
extra parentheses as in the macros). Perhaps this file should
be machine generated or everything should be handled like
__tolower() so that no code is repeated.
nomacros.c:
o Instead of looking at _USE_CTYPE_INLINE_ to see what <ctype.h>
has done, set _EXTERNALIZE_CTYPE_INLINES_ to tell <ctype.h>
what to do, so that we don't have anything left to do. Note
that code is now generated even if inlines are used by default.
This allows users to switch to non-inline versions.