Commit Graph

26 Commits

Author SHA1 Message Date
Thomas Munro
cc7edd258c Add collation version support to querylocale(3).
Provide a way to ask for an opaque version string for a locale_t, so
that potential changes in sort order can be detected.  Similar to
ICU's ucol_getVersion() and Windows' GetNLSVersionEx(), this API is
intended to allow databases to detect when text order-based indexes
might need to be rebuilt.

The CLDR version is extracted from CLDR source data by the Makefile
under tools/tools/locale, written into the machine-generated Makefile
under shared/colldef, passed to localedef -V, and then written into
LC_COLLATE file headers.  The initial version is 34.0.
tools/tools/locale was recently updated to pull down 35.0, but the
output hasn't been committed under share/colldef yet, so that will
provide the first observable change when it happens.  Other versioning
schemes are possible in future, because the format is unspecified.

Reviewed by:	bapt, 0mp, kib, yuripv (albeit a long time ago)
Differential Revision:	https://reviews.freebsd.org/D17166
2020-11-08 02:50:34 +00:00
Pedro F. Giffuni
d915a14ef0 libc: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using mis-identified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
2017-11-25 17:12:48 +00:00
Andrey A. Chernov
12eae8c8f3 1) Eliminate possibility to call __*collate_range_cmp() with inclomplete
locale (which cause core dump) by removing whole 'table' argument
by which it passed.

2) Restore __collate_range_cmp() in __sccl().

3) Collating [a-z] range in regcomp() only for single bytes locales
(we can't do it now for other ones). In previous state only first 256
wchars are considered and all others are just silently dropped from the
range.
2016-07-14 09:07:25 +00:00
Andrey A. Chernov
1daad8f5ad Back out non-collating [a-z] ranges.
Instead of changing whole course to another POSIX-permitted way
for consistency and uniformity I decide to completely ignore missing
regex fucntionality and concentrace on fixing bugs in what we have now,
too many small obstacles instead, counting ports.
2016-07-14 08:18:12 +00:00
Andrey A. Chernov
5a5807dd4c Remove broken support for collation in [a-z] type ranges.
Only first 256 wide chars are considered currently, all other are just
dropped from the range. Proper implementation require reverse tables
database lookup, since objects are really big as max UTF-8 (1114112
code points), so just the same scanning as it was for 256 chars will
slow things down.

POSIX does not require collation for [a-z] type ranges and does not
prohibit it for non-POSIX locales. POSIX require collation for ranges
only for POSIX (or C) locale which is equal to ASCII and binary for
other chars, so we already have it.

No other *BSD implements collation for [a-z] type ranges.

Restore ABI compatibility with unused now __collate_range_cmp() which
is visible from outside (will be removed later).
2016-07-10 03:49:38 +00:00
Pedro F. Giffuni
3c2c0c0443 libc/locale: Fix type breakage in __collate_range_cmp().
When collation support was brought in, the second and third
arguments in __collate_range_cmp() were changed from int to
wchar_t, breaking the ABI. Change them to a "char" type which
makes more sense and keeps the ABI compatible.

Also introduce __wcollate_range_cmp() which does work with wide
characters. This function is used only internally in libc so
we don't export it. Use the new function in glob(3), fnmatch(3),
and regexec(3).

PR:		179721
Suggested by:	ache. jilles
MFC after:	3 weeks (perhaps partial only)
2016-06-05 19:12:52 +00:00
Baptiste Daroussin
536451f914 Fix typo
Fix bad location for include

Reported by:	jilles
2015-08-09 00:21:59 +00:00
Baptiste Daroussin
d0a68f8d38 Fix typo 2015-08-08 23:17:10 +00:00
Baptiste Daroussin
7b2473410f Revamp CTYPE support (from Illumos & Dragonfly)
Obtained from:	Dragonfly
2015-08-08 18:22:14 +00:00
Baptiste Daroussin
2a6abeebef The collate functions within libc have been using version 1 and 1.2 of the
packed LC_COLLATE binary formats. These were generated with the colldef
tool, but the new LC_COLLATE files are going to be generated by the new
localedef tool using CLDR POSIX files as input.  The BSD-flavored
version of localedef identifies the format as "BSD 1.0".  Any
LC_COLLATE file with a different version will simply not be loaded, and
all LC* categories will get set to "C" (aka "POSIX") locale.

This work is based off of Nexenta's contribution to Illumos.
The integration with xlocale is John Marino's work for Dragonfly.

The following commits will enable localedef tool, disable the colldef
tool, add generated colldef directory, and finally remove colldef from
base.

The only difference with Dragonfly are:
- a few fixes to build with clang
- And identification of the flavor as "BSD 1.0" instead of "Dragonfly 4.4"

Obtained from:	Dragonfly
2015-08-07 23:41:26 +00:00
David Chisnall
3c87aa1d3d Implement xlocale APIs from Darwin, mainly for use by libc++. This adds a
load of _l suffixed versions of various standard library functions that use
the global locale, making them take an explicit locale parameter.  Also
adds support for per-thread locales.  This work was funded by the FreeBSD
Foundation.

Please test any code you have that uses the C standard locale functions!

Reviewed by:    das (gdtoa changes)
Approved by:    dim (mentor)
2011-11-20 14:45:42 +00:00
Ruslan Ermilov
edc431123e Make the format of LC_COLLATE files architecture independent. 2005-02-27 20:31:13 +00:00
Andrey A. Chernov
8e52da4dfc Prepare for switching to unlimited chains format.
Optimize chains lookup a bit.
2002-08-30 20:26:02 +00:00
Andrey A. Chernov
a2a26d0a3d Reduce BSS size for programs which not load collate by eliminating
static buffer.
2002-08-13 14:55:17 +00:00
Andrey A. Chernov
76692b8025 Rewrite locale loading procedures, so any load failure will not affect
currently cached data.  It allows a number of nice things, like: removing
fallback code from single locale loading, remove memory leak when LC_CTYPE
data loaded again and again, efficient cache use, not only for
setlocale(locale1); setlocale(locale1), but for setlocale(locale1);
setlocale("C"); setlocale(locale1) too (i.e.  data file loaded only once).
2002-08-08 05:51:54 +00:00
David E. O'Brien
c05ac53b8b Remove __P() usage. 2002-03-21 22:49:10 +00:00
Alexey Zelkin
f43a321bf0 style(9)'ify 2001-12-20 18:28:52 +00:00
Dmitrij Tejblum
e755fb7671 __collate_substitute() do something non-trivial only for German. For everyone
else, it is equivalent to strdup(). So, we will check if  the substitution
tables are trivial at the load time, and possibly save 2 calls to
__collate_substitute() in strcoll().

Still, __collate_substitute() should not exist.
1999-09-12 21:15:28 +00:00
Peter Wemm
7f3dea244c $Id$ -> $FreeBSD$ 1999-08-28 00:22:10 +00:00
Peter Wemm
7e546392b5 Revert $FreeBSD$ to $Id$ 1997-02-22 15:12:41 +00:00
Jordan K. Hubbard
1130b656e5 Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore.  This update would have been
insane otherwise.
1997-01-14 07:20:47 +00:00
Andrey A. Chernov
38aa46cdf0 Rename collate_range_cmp to __collate_range_cmp for internal usage
inside libc. Add collate_range_cmp as alias to __collate_range_cmp
for temp. backward compatibility.
collate_range_cmp will be replaced with direct code for each
external program for compatibility with the rest of world
1996-10-31 04:25:14 +00:00
Andrey A. Chernov
1642f84deb Save half of space in LC_COLLATE and remove unneded code.
This change is not compatible with previous variant, however proper
error code returned in both cases.
Colldef changes will follows.
1996-10-15 21:53:23 +00:00
Andrey A. Chernov
b339a4060f Remove old version hooks 1996-08-12 19:18:47 +00:00
Andrey A. Chernov
2eecfbac3a Add internal function __collcmp once instead of adding it statically
to many places in the libc
1996-08-12 03:40:37 +00:00
Andrey A. Chernov
c3d0cca4e9 Add 8-bit collate stuff
Submitted by: alex@elvisti.kiev.ua
1995-02-16 04:24:39 +00:00