Commit Graph

55 Commits

Author SHA1 Message Date
Thomas Munro
cc7edd258c Add collation version support to querylocale(3).
Provide a way to ask for an opaque version string for a locale_t, so
that potential changes in sort order can be detected.  Similar to
ICU's ucol_getVersion() and Windows' GetNLSVersionEx(), this API is
intended to allow databases to detect when text order-based indexes
might need to be rebuilt.

The CLDR version is extracted from CLDR source data by the Makefile
under tools/tools/locale, written into the machine-generated Makefile
under share/colldef, passed to localedef -V, and then written into
LC_COLLATE file headers.  The initial version is 34.0.
tools/tools/locale was recently updated to pull down 35.0, but the
output hasn't been committed under share/colldef yet, so that will
provide the first observable change when it happens.  Other versioning
schemes are possible in future, because the format is unspecified.

Reviewed by:	bapt, 0mp, kib, yuripv (albeit a long time ago)
Differential Revision:	https://reviews.freebsd.org/D17166
2020-11-08 02:50:34 +00:00
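A minimal sketch of how a consumer might use the version string described in the commit above. Because the string is opaque, only equality is compared. The helper names index_may_need_rebuild() and collation_version() are hypothetical, and the querylocale() call with LC_COLLATE_MASK | LC_VERSION_MASK is assumed to be available only on a FreeBSD that includes this change, so it is guarded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <locale.h>
#ifdef __FreeBSD__
#include <xlocale.h>
#endif

/*
 * Hypothetical helper: decide whether an index built under the `stored`
 * collation version should be re-checked against the `current` one.
 * The strings are opaque, so only equality is meaningful.
 */
int
index_may_need_rebuild(const char *stored, const char *current)
{
	if (stored == NULL || current == NULL)
		return (1);	/* unknown version: be conservative */
	return (strcmp(stored, current) != 0);
}

#if defined(__FreeBSD__) && defined(LC_VERSION_MASK)
/* Hypothetical wrapper: fetch the collation version of `loc`. */
const char *
collation_version(locale_t loc)
{
	return (querylocale(LC_COLLATE_MASK | LC_VERSION_MASK, loc));
}
#endif
```

A database could store the string alongside each text index and call index_may_need_rebuild(stored, collation_version(loc)) at startup.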
Gordon Bergling
3d265fce43 Fix a few mandoc issues
- skipping paragraph macro: Pp after Sh
- sections out of conventional order: Sh EXAMPLES
- whitespace at end of input line
- normalizing date format
2020-10-09 19:12:44 +00:00
Kyle Evans
ecebb3cc1d Only set WARNS if not defined
This would allow interested parties to do experimental runs with an
environment set appropriately to raise all the warnings throughout the
build; e.g. env WARNS=6 NO_WERROR=yes buildworld.

Not currently touching the numerous instances in ^/tools.

MFC after:	1 week
2020-09-11 13:28:37 +00:00
Alex Richardson
00c61a3b43 Allow bootstrapping localedef on non-FreeBSD systems
The current localedef simply assumes that the locale headers on the build system
are compatible with those on the target system, which is not necessarily true.
It generally works on FreeBSD (as long as we don't change the locale headers),
but Linux and macOS provide completely different locale headers.

This change adds new bootstrap headers that namespace certain xlocale
structures defined or used in the headers that localedef needs.
This is required since system headers *must* be able to include the "real"
locale headers for printf(), etc., but we also want to access the target
system's internal locale structures.

Reviewed By: yuripv, brooks
Differential Revision: https://reviews.freebsd.org/D25229
2020-07-15 12:07:59 +00:00
Alex Richardson
2becc9efb5 Add missing newline and return in localedef error message
I hit those error messages when using a localedef built against headers
that don't match the target system (cross-building from a Linux host).
This problem will be fixed in the next commit.
2020-07-15 12:07:53 +00:00
Jung-uk Kim
d836a9dbe3 Fix build with recent byacc. 2020-06-24 02:08:08 +00:00
Yuri Pankov
2d1cfed1b1 localedef: define characters in "space" class also as "print", except
for the known conflicts ("control" characters can't be "print"able).
POSIX doesn't explicitly forbid this, and actually includes the <space>
character in "print".

PR:		225692
Reviewed by:	bapt, cem (previous version), pfg (previous version)
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D17467
2018-10-27 23:31:42 +00:00
Yuri Pankov
4644f9bef6 Add -b/-l options to localedef(1) to specify output endianness and use
it appropriately when building share/ctypedef and share/colldef.

This makes the resulting locale data in EL->EB (amd64->powerpc64) cross
build and in the native EB build match.  Revert the changes done to libc
in r308170 as they are no longer needed.

PR:		231965
Reviewed by:	bapt, emaste, sbruno, 0mp
Approved by:	kib (mentor)
Differential Revision:	https://reviews.freebsd.org/D17603
2018-10-20 20:51:05 +00:00
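The idea behind an explicit output endianness can be sketched as follows: each multi-byte value is serialized byte by byte in the requested order, so the result does not depend on the host. This is an illustration only and not localedef's actual code; put_u32() and the byte_order enum are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum byte_order { ORDER_LITTLE, ORDER_BIG };	/* hypothetical names */

/* Store a 32-bit value in the requested order, independent of the host. */
void
put_u32(uint8_t out[4], uint32_t v, enum byte_order order)
{
	assert(out != NULL);
	for (int i = 0; i < 4; i++) {
		/* Big endian emits the most significant byte first. */
		int shift = (order == ORDER_BIG) ? 8 * (3 - i) : 8 * i;
		out[i] = (uint8_t)(v >> shift);
	}
}
```

With this approach a little-endian build host and a big-endian native build would produce byte-identical output for the same requested order.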
Pedro F. Giffuni
add14e43f8 localedef(1): remove duplicated includes.
Hinted by:	DragonFlyBSD
2018-07-09 20:38:47 +00:00
Eitan Adler
dae3a64fb9 userland: Fix several typos and minor errors
- duplicate words
- typos
- references to old versions of FreeBSD

Reviewed by:	imp, benno
2017-12-27 03:23:01 +00:00
Bryan Drewery
ea825d0274 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:07:04 +00:00
Enji Cooper
ead8d64aed Mark errf _Noreturn, and mark errf and warn __printflike
The _Noreturn attribute was added to placate Coverity and other static
analysis tools. The __printflike attribute was added to catch issues
with the calls related to printf(3) abuse.

- Modify the code to facilitate the __printflike attribute addition.
- Convert errf calls in to_mb(..) and to_mb_string(..) to warn(..) so
  the calls will return instead of exiting, as the code suggests it
  should.

Differential Revision:	D10704
MFC after:	1 month
Reviewed by:	pfg
Sponsored by:	Dell EMC Isilon
2017-05-14 18:47:09 +00:00
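A rough sketch of the two annotations mentioned in the commit above, written with portable spellings. FreeBSD's __printflike(1, 2) is assumed here to correspond to the GCC/Clang format(printf, 1, 2) attribute. The names report_warning(), report_fatal(), and warning_count are hypothetical and are not localedef's real definitions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Portable stand-in for what __printflike(fmt, args) expresses. */
#if defined(__GNUC__) || defined(__clang__)
#define PRINTFLIKE(fmt, args) __attribute__((format(printf, fmt, args)))
#else
#define PRINTFLIKE(fmt, args)
#endif

int warning_count = 0;	/* hypothetical counter */

void report_warning(const char *fmt, ...) PRINTFLIKE(1, 2);
_Noreturn void report_fatal(const char *fmt, ...) PRINTFLIKE(1, 2);

/* Prints a diagnostic and returns, so the caller can continue. */
void
report_warning(const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	warning_count++;
}

/* Prints a diagnostic and never returns to the caller. */
_Noreturn void
report_fatal(const char *fmt, ...)
{
	va_list ap;

	assert(fmt != NULL);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	exit(EXIT_FAILURE);
}
```

With the format attribute in place, a call such as report_warning("%s", 42) is diagnosed at compile time, and static analyzers know that code after report_fatal() is unreachable.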
Enji Cooper
6d284c0153 style(9): sort headers
MFC after:	3 weeks
Sponsored by:	Dell EMC Isilon
2017-05-13 19:59:03 +00:00
Warner Losh
a35f04fba2 Adopt SRCTOP in usr.bin
Prefer ${SRCTOP}/foo over ${.CURDIR}/../../foo and ${SRCTOP}/usr.bin/foo
over ${.CURDIR}/../foo for paths in Makefiles.

Differential Revision:	https://reviews.freebsd.org/D9932
Sponsored by:		Netflix
Silence on:		arch@ (twice)
2017-03-12 18:58:44 +00:00
Pedro F. Giffuni
e12a957f8d localedef(1): Add comment markings for license. 2017-03-10 16:12:16 +00:00
Pedro F. Giffuni
56b1edd680 localedef(1): Fix mismatch.
Obtained from:	illumos
X-MFC with:	r314974
2017-03-10 16:06:14 +00:00
Pedro F. Giffuni
1bb0ddf99d localedef(1): Fix small coverity issues.
- Operands don't affect result (CONSTANT_EXPRESSION_RESULT)
- Buffer not null terminated (BUFFER_SIZE_WARNING)

CID:	1338557, 1338565

Obtained from:	illumos
MFC after:	5 days
2017-03-09 21:49:11 +00:00
Pedro F. Giffuni
c48c87b790 Revert r314969, r314961:
The localedef(1) changes are breaking world:

00:18:40.750 /usr/src/share/colldef/af_ZA.UTF-8.src: 2421: error: Bad file
descriptor

I will fix them offline.

Reported by:	lwshu and many others
2017-03-09 19:02:36 +00:00
Pedro F. Giffuni
5f6fcdca5b localedef(1): Fix mismatch in previous commit.
delete_category is meant to replace fclose() and unlink().
This broke world.

Found by:	kib
Pointedhat:	pfg
2017-03-09 18:06:48 +00:00
Pedro F. Giffuni
830784ef0f localedef(1): Fix for memory leaks reported by coverity.
Also some small cleanups to better match current illumos.

CID: 1338540, 1338541, 1338557, 1338566

Obtained from:	illumos
Discussed with:	Yuri Pankov (@Nexenta)
MFC after:	5 days
2017-03-09 15:21:03 +00:00
Baptiste Daroussin
bbf9a45630 localedef: Improve cc_list parsing
original commit log:
=====
I had originally suspected the parsing of ctype definition files as being
the source of the ctype flag mis-definitions, but it wasn't.  In the
process, I simplified the cc_list parsing so I'm committing the no-impact
improvement separately.  It removes some parsing redundancies and
won't parse partial range definitions anymore.
====

Submitted by:	marino
Obtained from:	Dragonfly
MFC after:	1 month
2016-10-06 19:51:30 +00:00
Baptiste Daroussin
c7edf4fd0b localedef: Fix ctype dump (fixed widespread errors)
This commit is from John Marino in dragonfly with the following commit log:

====
This was a CTYPE encoding error involving consecutive points of the same
ctype.  It was reported by myself to Illumos over a year ago but I was
unsure if it was only happening on BSD.  Given the cause, the bug is also
present on Illumos.

Basically, if consecutive points were of the exact same ctype, they would
be defined as a range regardless.  For example, all of these would be
considered equivalent:

  <A> ... <C>, <H>  (converts to <A> .. <H>)
  <A>, <B>, <H>     (converts to <A> .. <H>)
  <A>, <J> ... <H>  (converts to <A> .. <H>)

So all the points that shouldn't have been defined got "bridged" by the
extreme points.

The effects were recently reported to FreeBSD in PR 213013.  There are
countless places where the ctype flags are misdefined, so this is a major
fix that has to be MFC'd.
====

This reveals a bad change I made to the testsuite: while 0x07FF is a valid
Unicode code point, it is not used yet (reserved for future use).

PR:		213013
Submitted by:	marino@
Reported by:	Kurtis Rader <krader@skepticism.us>
Obtained from:	Dragonfly
MFC after:	1 month
2016-10-06 19:46:43 +00:00
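The behaviour the fix aims for can be illustrated with a small sketch: code points that share a ctype are merged into a range only when they are truly adjacent, so <A>, <B>, <H> yields two ranges instead of one range bridging <A> to <H>. This is not the localedef implementation; build_ranges() and struct cp_range are hypothetical, and the input is assumed to be sorted and free of duplicates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cp_range { uint32_t first, last; };	/* hypothetical type */

/*
 * Collapse sorted, distinct code points that share one ctype into ranges.
 * Returns the number of ranges written to `out` (at most `n`).
 */
size_t
build_ranges(const uint32_t *points, size_t n, struct cp_range *out)
{
	size_t count = 0;

	assert(n == 0 || (points != NULL && out != NULL));
	for (size_t i = 0; i < n; i++) {
		if (count > 0 && points[i] == out[count - 1].last + 1) {
			/* Truly adjacent: extend the current range. */
			out[count - 1].last = points[i];
		} else {
			/* Gap in the input: start a new range. */
			out[count].first = points[i];
			out[count].last = points[i];
			count++;
		}
	}
	return (count);
}
```

For the input 0x41, 0x42, 0x48 this produces the ranges 0x41..0x42 and 0x48..0x48, leaving 0x43 through 0x47 undefined as intended.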
Pedro F. Giffuni
be4391a2d5 localedef(1): make better use of calloc(3) arguments.
The first argument of calloc(3) should be an ordinal type, and the
second a size: split a multiplication to make better use of calloc(3)
and detect overflows.

Do some other re-ordering and style fixes while here.

MFC after:	3 weeks
2016-09-14 16:47:17 +00:00
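A short sketch of the calloc(3) usage pattern the commit above describes: the element count and the element size are passed as separate arguments rather than pre-multiplied, which gives calloc(3) the chance to reject a product that would overflow. The struct weight type and alloc_weights() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record type used only for this illustration. */
struct weight { int32_t pri; int32_t ref; };

/*
 * Allocate `count` zeroed records.  Passing the count and the element
 * size separately lets calloc(3) detect an overflowing product.
 */
struct weight *
alloc_weights(size_t count)
{
	assert(count > 0);	/* calloc(0, ...) is implementation-defined */
	/* Avoid: calloc(1, count * sizeof(struct weight)), which can wrap. */
	return (calloc(count, sizeof(struct weight)));
}
```

The caller still has to check the result for NULL and release it with free(3).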
Marcelo Araujo
4c22fda976 - Invert calloc(3) argument order.
MFC after:	4 weeks
2016-09-01 15:23:33 +00:00
Pedro F. Giffuni
fcc7baa1ae localedef(1): minor spelling fixes on comments.
No functional change.
2016-05-01 16:10:56 +00:00
Pedro F. Giffuni
0b33b55b01 Small typo. 2016-04-28 15:20:08 +00:00
Baptiste Daroussin
e6d8c0e2dd Plug memory leaks
Reported by:	Coverity
CID=		1338535, 1338536, 1338542, 1338569, 1338570
2016-04-20 21:23:42 +00:00
Pedro F. Giffuni
046c3cda83 localedef(1): minor sorting to match Illumos.
Illumos recently included space in 'print' class. We already had
this but the code had slight sorting differences. Move it some
lines up to reduce diffs with Illumos.

No functional change.

Reference:
https://illumos.org/issues/5227
2016-03-20 03:27:06 +00:00
Bryan Drewery
bd18fd57db DIRDEPS_BUILD: Regenerate without local dependencies.
These are no longer needed after the recent 'beforebuild: depend' changes
and hooking DIRDEPS_BUILD into a subset of FAST_DEPEND which supports
skipping 'make depend'.

Sponsored by:	EMC / Isilon Storage Division
2016-02-24 17:20:11 +00:00
Bryan Drewery
393608176b META MODE: Fix 'make the-lot' with recent locale changes
Sponsored by:	EMC / Isilon Storage Division
2015-11-25 19:13:28 +00:00
Baptiste Daroussin
c5aac62ae4 lower again the warnings and remove the pragmas unsupported by gcc 4.2.1 2015-11-08 22:23:21 +00:00
Baptiste Daroussin
55b270e68c Eliminate some gcc pragmas 2015-11-08 21:22:24 +00:00
Baptiste Daroussin
8c859b074e Fix build of localedef(1) on arm where wchar_t is an unsigned int 2015-11-07 22:57:00 +00:00
Baptiste Daroussin
00d10c2c70 Rewrite the history part
Fix information about the "Dragonfly-style" format, which on FreeBSD is named
BSD-style

Noted by:	bdrewery
2015-11-07 21:07:40 +00:00
Baptiste Daroussin
5b3b54e06c Improve localedef(1) manpage
Obtained from:	DragonflyBSD
2015-11-07 20:36:54 +00:00
Baptiste Daroussin
29660f86e2 Bump warning level 2015-11-07 20:31:23 +00:00
Baptiste Daroussin
a0e395a47f Use const where needed instead of using pragmas to work around the warnings 2015-11-07 20:29:23 +00:00
Baptiste Daroussin
557a07f08a Make bsd declaration static 2015-11-07 20:27:31 +00:00
Baptiste Daroussin
5d21db0905 Fix an off by one due to bad conversion from avl(3) to tree(3)
Re-add calloc, as it was not the issue, just the messenger

Submitted by:	dim
Found by:	Address Sanitizer
2015-11-07 19:54:40 +00:00
Baptiste Daroussin
e12838d367 Run memset only after having checked the return of malloc
Submitted by:	pluknet
2015-11-07 16:45:51 +00:00
Baptiste Daroussin
6cdc211add Work around an issue on i386 to unbreak the build until the real issue is tracked
down
2015-11-07 16:22:29 +00:00
Baptiste Daroussin
78be8e6732 Fix build on arm64 2015-11-07 15:03:45 +00:00
Baptiste Daroussin
99b72f8fa4 Add missing header 2015-11-07 12:11:17 +00:00
Baptiste Daroussin
9f3e8dc233 Fix typo 2015-11-07 11:08:19 +00:00
Baptiste Daroussin
d79cdd21de libc: Fix (and improve) nl_langinfo (CODESET)
The output of "locale charmap" is identical to the result of
nl_langinfo (CODESET) for any given locale.  The logic for returning the
codeset was very simplistic.  It just returned the portion of the locale name
after the period (e.g. en_FR.ISO8859-1 returned "ISO8859-1").

When softlinks were added to locales, this broke.  e.g.:
   en_US returned ""
   en_FR.UTF8 returned "UTF8"
   en_FR.UTF-8 returned "UTF-8"
   zh_Hant_HK.Big5HKSCS returned "Big5HKSCS"
   zh_Hant_TW.Big5 returned "Big5"
   es_ES@euro returned ""

In order to fix this properly, the named locale cannot be used to
determine the encoding.  This information was almost available in the
rune data.  Unfortunately, all the single byte encodings were listed
as "NONE" encoding.

So I adjusted the localedef tool to provide more information about the
encoding.  For example, instead of "NONE", the LC_CTYPE used by
fr_FR.ISO8859-15 is now encoded as "NONE:ISO8859-15".  The locale
handlers now check if the first four characters of the encoding are
"NONE" and, if so, treat it as a single-byte encoding.

The nl_langinfo handling of CODESET was adjusted accordingly.  Now the
following is returned:
   en_US returns "ISO8859-1"
   fr_FR.UTF8 returns "UTF-8"
   fr_FR.UTF-8 returns "UTF-8"
   zh_Hant_HK.Big5HKSCS returns "Big5"
   zh_Hant_TW.Big5 returns "Big5"
   es_ES@euro returns "ISO8859-15"

As before, "C" and "POSIX" locales return "US-ASCII".  This is a big
improvement.  The result of nl_langinfo can never be a zero-length
string and it will always be exactly one of the values of the
character maps of /usr/src/tools/tools/locale/etc/final-maps.

Submitted by:	marino
Obtained from:	DragonflyBSD
2015-11-01 12:00:55 +00:00
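The "NONE:" prefix convention described above can be sketched as follows. This is an illustration rather than libc's actual handler: is_single_byte() and codeset_from_encoding() are hypothetical names, returning other encoding names unchanged is a simplification, and returning "US-ASCII" for a bare "NONE" is an assumption of this sketch.

```c
#include <assert.h>
#include <string.h>

/* True when the LC_CTYPE encoding name marks a single-byte character set. */
int
is_single_byte(const char *encoding)
{
	assert(encoding != NULL);
	return (strncmp(encoding, "NONE", 4) == 0);
}

/*
 * Map an encoding name to a codeset string.
 * "NONE:ISO8859-15" yields "ISO8859-15"; names without the "NONE" prefix
 * are returned unchanged (a simplification).  A bare "NONE" yields
 * "US-ASCII", which is an assumption of this sketch.
 */
const char *
codeset_from_encoding(const char *encoding)
{
	if (!is_single_byte(encoding))
		return (encoding);
	if (encoding[4] == ':' && encoding[5] != '\0')
		return (encoding + 5);
	return ("US-ASCII");
}
```

Because the codeset comes from the encoding recorded in the locale data and not from the locale name, a symlinked name such as en_US can still resolve to a non-empty codeset.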
Baptiste Daroussin
71e8badedc Actually only T_ISDIGIT should be flagged as _E4 2015-10-19 14:48:31 +00:00
Baptiste Daroussin
227d35dac0 With regard to ctype, digits (e.g. 0 to 9) and xdigits (the 0 to 9 portion
of hexadecimal numbers) are all considered "numbers".  (Note that while
all digits are numbers, not all numbers are digits.)

Enhance localedef to automatically set the "number" characteristic when
it encounters a digit or xdigit definition.  This fixes malfunctioning
isalnum(3).

Obtained from:	DragonflyBSD
2015-10-19 14:30:28 +00:00
Baptiste Daroussin
8833f5e9c2 eliminate need for "print" definition
By having space automatically classified as "print" type, we can
eliminate the print section from ctype src files completely (they
are just "graph" plus "<space>").

Obtained from:	Dragonfly
2015-10-13 20:45:29 +00:00
Baptiste Daroussin
f5dde0166d Commit log from Dragonfly:
FreeBSD extended ctypes to include numbers (e.g. isnumber()) but never
actually implemented it.  The isnumber() function was equivalent to the
isdigit() function in every case.

Now that DragonFly's ctype source files have number definitions, the
number ctype can finally be implemented.  It's given a new flag _CTYPE_N.
The isalnum() and iswalnum() functions have been changed to use this
flag rather than the _CTYPE_D digit flag.

While isalnum(), isnumber(), and their wide equivalents now return
different values in locale cases, the ishexnumber() and iswhexnumber()
functions are unchanged.  They are still aliases for isxdigit() and
iswxdigit().

Also change ctype.h for isdigit and isxdigit to use sbistype like the
other functions.

Obtained from:	dragonfly
2015-10-13 20:43:49 +00:00
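The relationship between the digit and number classes described above can be sketched with plain bit flags. The X_* values and x_is*() helpers below are hypothetical stand-ins; the real _CTYPE_* constants and the libc macros differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values; the real _CTYPE_* constants differ. */
#define X_ALPHA   0x1u
#define X_DIGIT   0x2u
#define X_NUMBER  0x4u	/* superset of digit */

/* Digit test: only characters carrying the digit flag. */
int x_isdigit(uint32_t f)  { return ((f & X_DIGIT) != 0); }

/* Number test: uses the separate number flag. */
int x_isnumber(uint32_t f) { return ((f & X_NUMBER) != 0); }

/* Alphanumeric test: alpha or number, no longer alpha or digit. */
int x_isalnum(uint32_t f)  { return ((f & (X_ALPHA | X_NUMBER)) != 0); }
```

Under this scheme a character flagged as a number but not as a digit counts as alphanumeric while still failing the digit test.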
Baptiste Daroussin
23a32822d2 Merge from HEAD 2015-08-25 20:14:50 +00:00