.\" Copyright (c) 1995 Alex Tatmanjants <alex@elvisti.kiev.ua>
.\" at Electronni Visti IA, Kiev, Ukraine.
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd January 27, 1995
.Dt COLLDEF 1
.Os
.Sh NAME
.Nm colldef
.Nd convert collation sequence source definition
.Sh SYNOPSIS
.Nm
.Op Fl I Ar map_dir
.Op Fl o Ar out_file
.Op Ar filename
.Sh DESCRIPTION
The
.Nm
utility converts a collation sequence source definition
into a format usable by the
.Fn strxfrm
and
.Fn strcoll
functions.
It is used to define the many ways in which
strings can be ordered and collated.
The
.Fn strxfrm
function transforms
its second argument and places the result in its first
argument.
The transformed string is such that it can be
correctly ordered with other transformed strings by using
.Fn strcmp ,
.Fn strncmp ,
or
.Fn memcmp .
The
.Fn strcoll
function transforms its arguments and compares them.
.Pp
The
.Nm
utility reads the collation sequence source definition
from
.Ar filename ,
or from the standard input if no file is given,
and stores the converted definition in
.Ar out_file .
The output file produced contains the
database with collating sequence information in a form
usable by system commands and routines.
.Pp
The following options are available:
.Bl -tag -width indent
.It Fl I Ar map_dir
Set the directory in which
.Ar charmap
files are searched for; the current directory is the default.
.It Fl o Ar out_file
Set the output file name;
.Ar LC_COLLATE
is the default.
.El
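.Pp
As an illustration of the synopsis, and assuming a hypothetical source file
.Pa example.src
whose charmap file is kept in a hypothetical directory
.Pa maps ,
an invocation might look like this:
.Pp
.Dl "colldef -I maps -o LC_COLLATE example.src"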
.Pp
The collation sequence definition specifies a set of collating elements and
the rules defining how strings containing these should be ordered.
This is most useful for different language definitions.
.Pp
The specification file can consist of three statements:
.Ar charmap ,
.Ar substitute
and
.Ar order .
.Pp
Of these, only the
.Ar order
statement is required.
When
.Ar charmap
or
.Ar substitute
is
supplied, these statements must be ordered as above.
Any
statements after the order statement are ignored.
.Pp
Lines in the specification file beginning with a
.Ql #
are
treated as comments and are ignored.
Blank lines are also
ignored.
.Pp
.Dl "charmap charmapfile"
.Pp
.Ar Charmap
defines where a mapping of the character
and collating element symbols to the actual
character encoding can be found.
.Pp
The format of
.Ar charmapfile
is shown below.
Symbol
names are separated from their values by TAB or
SPACE characters.
A symbol value can be specified in
hexadecimal (\ex\fI??\fR) or octal (\e\fI???\fR)
representation, and must be exactly one character in length.
.Pp
.Bd -literal -offset indent
symbol-name1 symbol-value1
symbol-name2 symbol-value2
\&...
.Ed
.Pp
Symbol names cannot be specified in
.Ar substitute
fields.
.Pp
The
.Ar charmap
statement is optional.
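.Pp
As a minimal sketch, a hypothetical
.Ar charmapfile
could contain the two records below; the symbol names and values are
illustrative assumptions, not taken from any shipped map.
.Bd -literal -offset indent
letterA	\ex41
letterB	\ex42
.Ed
.Pp
An order list could then refer to these records as
.Ar <letterA>
and
.Ar <letterB> .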
.Pp
.Bd -literal -offset indent
substitute "symbol" with "repl_string"
.Ed
.Pp
The
.Ar substitute
statement substitutes the character
.Ar symbol
with the string
.Ar repl_string .
Symbol names cannot be specified in the
.Ar repl_string
field.
The
.Ar substitute
statement is optional.
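.Pp
For instance, assuming the goal is to have an ampersand replaced by its
spelled-out form for collation purposes (an illustrative choice, not one
taken from any shipped locale), the statement could be written as:
.Bd -literal -offset indent
substitute "&" with "and"
.Ed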
.Pp
.Dl "order order_list"
.Pp
.Ar Order_list
is a list of symbols, separated by semicolons, that defines the
collating sequence.
The
special symbol
.Ar ...
specifies, in a shorthand
form, symbols that are sequential in machine code
order.
.Pp
An order list element
can be represented in any one of the following
ways:
.Bl -bullet
.It
The symbol itself (for example,
.Ar a
for the lower-case letter
.Ar a ) .
.It
The symbol in octal representation (for example,
.Ar \e141
for the letter
.Ar a ) .
.It
The symbol in hexadecimal representation (for example,
.Ar \ex61
for the letter
.Ar a ) .
.It
The symbol name as defined in the
.Ar charmap
file (for example,
.Ar <letterA>
for the
.Ar letterA \e023
record in
.Ar charmapfile ) .
If a character map name contains a
.Ar >
character, it must be escaped as
.Ar /> ;
a single
.Ar /
must be escaped as
.Ar // .
.It
The symbols
.Ar \ea ,
.Ar \eb ,
.Ar \ef ,
.Ar \en ,
.Ar \er ,
.Ar \ev
are permitted with their usual C-language meanings.
.It
The symbol chain (for example,
.Ar abc ,
.Ar <letterA><letterB>c ,
.Ar \exf1b\exf2 ) .
.It
The symbol range (for example,
.Ar a;...;z ) .
.It
Comma-separated symbols, ranges and chains enclosed in parentheses (for example,
.Ar \&(
.Ar sym1 ,
.Ar sym2 ,
.Ar ...
.Ar \&) )
are assigned the
same primary ordering but different secondary
ordering.
.It
Comma-separated symbols, ranges and chains enclosed in curly brackets (for example,
.Ar \&{
.Ar sym1 ,
.Ar sym2 ,
.Ar ...
.Ar \&} )
are assigned the same primary ordering only.
.El
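.Pp
The element forms listed above can be combined in one list.
The following single-line fragment is only a sketch: it assumes a charmap
file that defines a hypothetical
.Ar letterD
symbol, and it is not a complete definition for any character set.
.Bd -literal -offset indent
order \ex20;0;...;9;(A,a);(B,b);{C,c};<letterD>;e;...;z
.Ed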
.Pp
The backslash character
.Ar \e
is used for continuation.
In this case, no characters are permitted
after the backslash character.
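.Pp
A long order list might therefore be split across lines as sketched
below; this is an illustrative fragment, not a complete definition, and
each continued line ends with the backslash as its last character.
.Bd -literal -offset indent
order \e
	0;...;9;\e
	(A,a);(B,b);(C,c)
.Ed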
.Sh EXIT STATUS
The
.Nm
utility exits with the following values:
.Bl -tag -width indent
.It Li 0
No errors were found and the output was successfully created.
.It Li !=0
Errors were found.
.El
.Sh FILES
.Bl -tag -width indent
.It Pa /usr/share/locale/ Ns Ao Ar language Ac Ns Pa /LC_COLLATE
The standard shared location for collation orders
under the locale
.Aq Ar language .
.El
.Sh SEE ALSO
.Xr mklocale 1 ,
.Xr setlocale 3 ,
.Xr strcoll 3 ,
.Xr strxfrm 3