Commit Graph

60 Commits

Author SHA1 Message Date
Thomas Munro
cc7edd258c Add collation version support to querylocale(3).
Provide a way to ask for an opaque version string for a locale_t, so
that potential changes in sort order can be detected.  Similar to
ICU's ucol_getVersion() and Windows' GetNLSVersionEx(), this API is
intended to allow databases to detect when text order-based indexes
might need to be rebuilt.

The CLDR version is extracted from CLDR source data by the Makefile
under tools/tools/locale, written into the machine-generated Makefile
under shared/colldef, passed to localedef -V, and then written into
LC_COLLATE file headers.  The initial version is 34.0.
tools/tools/locale was recently updated to pull down 35.0, but the
output hasn't been committed under share/colldef yet, so that will
provide the first observable change when it happens.  Other versioning
schemes are possible in future, because the format is unspecified.

Reviewed by:	bapt, 0mp, kib, yuripv (albeit a long time ago)
Differential Revision:	https://reviews.freebsd.org/D17166
2020-11-08 02:50:34 +00:00
Mark Johnston
57e642365b libc: Fix a few bugs in the xlocale collation code.
- Fix checks for mmap() failures. [1]
- Set the "map" and "maplen" fields of struct xlocale_collate so that
  the table destructor actually does something.
- Free an already-mapped collation file before loading a new one into
  the global table.
- Harmonize the prototype and definition of __collate_load_tables_l() by
  adding the "static" qualifier to the latter.

PR:		243195
Reported by:	cem [1]
Reviewed by:	cem, yuripv
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23109
2020-01-09 20:49:26 +00:00
Yuri Pankov
dd7c41a378 Add hybrid C.UTF-8 locale being identical to default C locale except
that it uses the same ctype maps and functions as other UTF-8 locales.

Reviewed by:	bapt, cem, eadler
Approved by:	kib (mentor, implicit)
Differential Revision:	https://reviews.freebsd.org/D17833
2018-11-04 22:13:22 +00:00
Yuri Pankov
4644f9bef6 Add -b/-l options to localedef(1) to specify output endianness and use
it appropriately when building share/ctypedef and share/colldef.

This makes the resulting locale data in EL->EB (amd64->powerpc64) cross
build and in the native EB build match.  Revert the changes done to libc
in r308170 as they are no longer needed.

PR:		231965
Reviewed by:	bapt, emaste, sbruno, 0mp
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D17603
2018-10-20 20:51:05 +00:00
Pedro F. Giffuni
d915a14ef0 libc: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using mis-identified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-25 17:12:48 +00:00
Bryan Drewery
dc8507e1f7 __setrunelocale: Fix asprintf(3) failure not returning an error.
Also fix the style of the asprintf(3) call in __collate_load_tables_l().
Both of these lines were modified away from snprintf(3) during the
import from DragonFly/Illumos.

Reviewed by:	jilles (briefly over shoulder)
MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-09-29 16:30:50 +00:00
Ruslan Bukin
77bc2a1cd6 Locale fix for endian big (EB) machines.
We have locale files generated on EL machines (e.g. during cross-build
on amd64 host), but then we are using them on EB machines (e.g. MIPS64EB),
so proceed byte-swap if necessary.

All the libc tests passed successfully, including Russian collation.

Tested by:	br@, Hongyan Xia <hx242@cam.ac.uk>
Sponsored by:	DARPA, AFRL
Sponsored by:	HEIF5
Differential Revision:	https://reviews.freebsd.org/D8281
2016-11-01 13:54:44 +00:00
Baptiste Daroussin
dee0bbbdca Revert 302324 and properly fix the crash with ISO-8859-5 locales
PR:		211135
Reported by:	jkim
Tested by:	jkim
MFC after:	2 days
2016-07-15 23:03:20 +00:00
Baptiste Daroussin
a8bacd6f93 Fix a bad test resulting in a segfault with ISO-8859-5 locales
Reported by:	Lauri Tirkkonen from Illumos
Approved by:	re@ (gjb)
2016-07-03 15:00:12 +00:00
Pedro F. Giffuni
32223c1b7d libc: spelling fixes.
Mostly on comments.
2016-04-30 01:24:24 +00:00
Baptiste Daroussin
76e6db686e collate: Fix expansion substitions (broken upstream too)
Through testing, the user noted that some Cyrillic characters were not
sorting correctly, and this was confirmed.

After extensive testing and review, the localedef tool was eliminated
as the culprit.  The sustitutions were encoded correctly in LC_COLLATE.

The error was mainly in wcscoll where character expansions were
mishandled.  The main directive pass routines had to be written to
go back for a new collation value when the "state" variable was set.
Before pointers were being advanced, the second lookup was gettting
applied to the wrong character, etc.

The "eat expansion codes" section on collate.c also had a bug.  Later
own, the "state" variable logic was changed to only set if next
code was greater than zero (rather than >= 0).

Some additional cleanups got captured from previous work:
1) The previous commit moved the binary search comment from the
   correct location to a wrong location because it's wrong upstream
   in Illumos.  The comment has little value so I just removed it.
2) Don't check if pointers are null before freeing, this is
   redundant as free() handles null pointers.
3) The two binary search trees were standardized wrt initialization
4) On the binary search trees, a negative "high" exits rather than
   checking the table count again.

Submitted by:	marino
Obtained from:	DragonflyBSD
2015-10-23 23:24:03 +00:00
Baptiste Daroussin
332fe83717 libc/collate: minor tweaks / fix
The main "fix" here is properly setting a collate loading error for each
early return.  Tweaks include removing unnecessary null checks, adding
assertions (from Illumos) and a couple of variables to reduces code
differences and improve readability.  For normal use, there are no
functional changes here.

Obtained from:	DragonflyBSD, Illumos
2015-10-22 14:29:19 +00:00
Baptiste Daroussin
c25f5140e9 Include sys/*.h earlier
Reported by:	kib
2015-10-14 12:46:05 +00:00
Baptiste Daroussin
eaa94ab419 Fix typo 2015-08-09 12:20:22 +00:00
Baptiste Daroussin
28a20bb3f5 Use more asprintf
Plug memory leak introduced in previous asprintf addition
2015-08-09 12:13:30 +00:00
Baptiste Daroussin
b89704cee7 Use asprintf/free instead of snprintf 2015-08-09 11:50:50 +00:00
Baptiste Daroussin
5e4bbc69de Remove useless variable 2015-08-09 11:47:01 +00:00
Baptiste Daroussin
a6d2922cbb Mark __collate_load_tables_l as static
Remove useless addition to Symbols.map
2015-08-09 10:24:24 +00:00
Baptiste Daroussin
536451f914 Fix typo
Fix bad location for include

Reported by:	jilles
2015-08-09 00:21:59 +00:00
Baptiste Daroussin
7b2473410f Revamp CTYPE support (from Illumos & Dragonfly)
Obtained from:	Dragonfly
2015-08-08 18:22:14 +00:00
Baptiste Daroussin
2a6abeebef The collate functions within libc have been using version 1 and 1.2 of the
packed LC_COLLATE binary formats. These were generated with the colldef
tool, but the new LC_COLLATE files are going to be generated by the new
localedef tool using CLDR POSIX files as input.  The BSD-flavored
version of localedef identifies the format as "BSD 1.0".  Any
LC_COLLATE file with a different version will simply not be loaded, and
all LC* categories will get set to "C" (aka "POSIX") locale.

This work is based off of Nexenta's contribution to Illumos.
The integration with xlocale is John Marino's work for Dragonfly.

The following commits will enable localedef tool, disable the colldef
tool, add generated colldef directory, and finally remove colldef from
base.

The only difference with Dragonfly are:
- a few fixes to build with clang
- And identification of the flavor as "BSD 1.0" instead of "Dragonfly 4.4"

Obtained from:	Dragonfly
2015-08-07 23:41:26 +00:00
Jilles Tjoelker
07588bf421 libc: Make various internal file descriptors close-on-exec.
These are obtained via fopen().
2012-12-11 22:52:56 +00:00
David Chisnall
bb4317bf3c Restore the __collate_load_error global that was accidentally removed in the
xlocale refactoring.

MFC after:	1 week
2012-07-06 20:16:22 +00:00
David Chisnall
7dfd88318b Remove some duplicated copyright notices.
Approved by:	dim (mentor)
2012-03-06 12:53:44 +00:00
David Chisnall
3c87aa1d3d Implement xlocale APIs from Darwin, mainly for use by libc++. This adds a
load of _l suffixed versions of various standard library functions that use
the global locale, making them take an explicit locale parameter.  Also
adds support for per-thread locales.  This work was funded by the FreeBSD
Foundation.

Please test any code you have that uses the C standard locale functions!

Reviewed by:    das (gdtoa changes)
Approved by:    dim (mentor)
2011-11-20 14:45:42 +00:00
Ruslan Ermilov
edc431123e Make the format of LC_COLLATE files architecture independent. 2005-02-27 20:31:13 +00:00
Alexey Zelkin
f9b5e461bb ANSI'fy prototypes 2005-02-27 14:54:23 +00:00
Stefan Farfeleder
e60b9f5130 Prefer C99's __func__ over GCC's __FUNCTION__. 2004-09-22 16:56:49 +00:00
Tim J. Robbins
a019c0e525 Remove unnecessary inclusion of <rune.h> to make it obvious that this file
does not use the deprecated rune system.
2002-10-29 09:03:57 +00:00
Andrey A. Chernov
c14170612e Use ntohl() to read cnains number in new format 2002-08-31 01:05:39 +00:00
Andrey A. Chernov
cbc98d0541 Style fix 2002-08-30 20:39:53 +00:00
Andrey A. Chernov
8e52da4dfc Prepare for switching to unlimited chains format.
Optimize chains lookup a bit.
2002-08-30 20:26:02 +00:00
Andrey A. Chernov
a2a26d0a3d Reduce BSS size for programs which not load collate by eliminating
static buffer.
2002-08-13 14:55:17 +00:00
Andrey A. Chernov
e34fe8a408 Now malloc() is fixed, remove errno hardcoding to ENOMEM 2002-08-12 17:14:04 +00:00
Andrey A. Chernov
76692b8025 Rewrite locale loading procedures, so any load failure will not affect
currently cached data.  It allows a number of nice things, like: removing
fallback code from single locale loading, remove memory leak when LC_CTYPE
data loaded again and again, efficient cache use, not only for
setlocale(locale1); setlocale(locale1), but for setlocale(locale1);
setlocale("C"); setlocale(locale1) too (i.e.  data file loaded only once).
2002-08-08 05:51:54 +00:00
Andrey A. Chernov
6892b144e8 Style fixes in preparation for rewritting 2002-08-07 18:02:45 +00:00
Mark Murray
4cd0119367 Do not use __progname directly (except in [gs]etprogname(3)).
Also, make an internal _getprogname() that is used only inside
libc. For libc, getprogname(3) is a weak symbol in case a
function of the same name is defined in userland.
2002-03-29 22:43:43 +00:00
David E. O'Brien
333fc21e3c Fix the style of the SCM ID's.
I believe have made all of libc .c's as consistent as possible.
2002-03-22 21:53:29 +00:00
Daniel Eischen
d201fe46e3 Remove _THREAD_SAFE and make libc thread-safe by default by
adding (weak definitions to) stubs for some of the pthread
functions.  If the threads library is linked in, the real
pthread functions will pulled in.

Use the following convention for system calls wrapped by the
threads library:
	__sys_foo - actual system call
	_foo - weak definition to __sys_foo
	foo - weak definition to __sys_foo

Change all libc uses of system calls wrapped by the threads
library from foo to _foo.  In order to define the prototypes
for _foo(), we introduce namespace.h and un-namespace.h
(suggested by bde).  All files that need to reference these
system calls, should include namespace.h before any standard
includes, then include un-namespace.h after the standard
includes and before any local includes.  <db.h> is an exception
and shouldn't be included in between namespace.h and
un-namespace.h  namespace.h will define foo to _foo, and
un-namespace.h will undefine foo.

Try to eliminate some of the recursive calls to MT-safe
functions in libc/stdio in preparation for adding a mutex
to FILE.  We have recursive mutexes, but would like to avoid
using them if possible.

Remove uneeded includes of <errno.h> from a few files.

Add $FreeBSD$ to a few files in order to pass commitprep.

Approved by:	-arch
2001-01-24 13:01:12 +00:00
Jason Evans
9233c4d942 Simplify sytem call renaming. Instead of _foo() <-- _libc_foo <-- foo(),
just use _foo() <-- foo().  In the case of a libpthread that doesn't do
call conversion (such as linuxthreads and our upcoming libpthread), this
is adequate.  In the case of libc_r, we still need three names, which are
now _thread_sys_foo() <-- _foo() <-- foo().

Convert all internal libc usage of: aio_suspend(), close(), fsync(), msync(),
nanosleep(), open(), fcntl(), read(), and write() to _foo() instead of foo().

Remove all internal libc usage of: creat(), pause(), sleep(), system(),
tcdrain(), wait(), and waitpid().

Make thread cancellation fully POSIX-compliant.

Suggested by:	deischen
2000-01-27 23:07:25 +00:00
Jason Evans
929273386f Add three-tier symbol naming in support of POSIX thread cancellation
points.  For library functions, the pattern is __sleep() <--
_libc_sleep() <-- sleep().  The arrows represent weak aliases.  For
system calls, the pattern is _read() <-- _libc_read() <-- read().
2000-01-12 09:23:48 +00:00
Dmitrij Tejblum
e755fb7671 __collate_substitute() do something non-trivial only for German. For everyone
else, it is equivalent to strdup(). So, we will check if  the substitution
tables are trivial at the load time, and possibly save 2 calls to
__collate_substitute() in strcoll().

Still, __collate_substitute() should not exist.
1999-09-12 21:15:28 +00:00
Dmitrij Tejblum
03a7efc234 Reduce time of __collate_substitute() from O(strlen(s)^2) to O(strlen(s)).
Other minor optimizations. I got ~30% speedup in strcoll() for 50 char strings,
~40% speedup for 100 char strings, and unmeasurable speedup for 1M strings.

Collates are still terribly slow. To make them reasonable fast,
__collate_substitute() should be killed.
1999-09-12 19:42:38 +00:00
Peter Wemm
7f3dea244c $Id$ -> $FreeBSD$ 1999-08-28 00:22:10 +00:00
Warner Losh
e8420087b0 Replace memory leaking instances of realloc with non-leaking reallocf.
In some cases replace if (a == null) a = malloc(x); else a =
realloc(a, x); with simple reallocf(a, x).  Per ANSI-C, this is
guaranteed to be the same thing.

I've been running these on my system here w/o ill effects for some
time.  However, the CTM-express is at part 6 of 34 for the CAM
changes, so I've not been able to do a build world with the CAM in the
tree with these changes.  Shouldn't impact anything, but...
1998-09-16 04:17:47 +00:00
Peter Wemm
7e546392b5 Revert $FreeBSD$ to $Id$ 1997-02-22 15:12:41 +00:00
Andrey A. Chernov
63407d3487 Use symbolic constants instead of hardcoded digits
Add range check for setrunelocale since it can be called
directly.
Remove _startup_setlocale compatibility function

Should go into 2.2
1997-02-06 09:11:06 +00:00
Andrey A. Chernov
d81a091605 Update the comment why range checking not needed
Fix setrunelocale fail if called directly without prior setlocale
call

Should go in 2.2
1997-02-05 19:17:10 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Andrey A. Chernov
af155bdff3 Add comment that range checking is already done at upper level
Kill snprintf left in collate.c from previous backout

Should go in 2.2
1996-12-28 05:04:24 +00:00