Tim J. Robbins
26bdc1d12e
Tweak markup of quoted strings and characters: use Dq instead of enclosing
strings in ``obsolete quotes''. Use Li and Ql where appropriate.
2004-07-23 06:06:58 +00:00
Tim J. Robbins
0b651019b4
Add a lengthy discussion of why "tr a-z A-Z" and "tr A-Z a-z" are not the
right way to perform case-conversion.
2004-07-23 05:44:04 +00:00
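The discussion itself lives in the manual page and is not reproduced in this log. As a rough illustration of the point, a range such as a-z only names whatever falls between its two endpoints, while a class-based conversion asks the locale which characters have an uppercase form. The C sketch below does the class-based variant with towupper(3); upcase_wcs() is a hypothetical helper written for this note, not code from tr.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Upper-case a wide string in place using the current locale's
 * case mapping rather than a hard-coded a-z to A-Z table.
 * upcase_wcs() is a hypothetical name used only for this sketch.
 */
static void
upcase_wcs(wchar_t *s)
{
	for (; *s != L'\0'; s++)
		*s = (wchar_t)towupper((wint_t)*s);
}
```

Characters without an uppercase counterpart (digits, punctuation, spaces) pass through unchanged, which is the behavior one would expect from tr '[:lower:]' '[:upper:]'.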
Tim J. Robbins
c75b843169
Fix description of cmap_lookup_hard().
2004-07-14 08:36:09 +00:00
Tim J. Robbins
9aed43ae23
Remove unused member of struct csclass: csc_value.
2004-07-14 08:35:11 +00:00
Tim J. Robbins
cfab3bdd89
Splay the left and right subtrees on min - 1 and max + 1, respectively,
before trying to coalesce. Forgetting to splay caused us to miss many
opportunities for coalescing.
2004-07-14 08:33:14 +00:00
Tim J. Robbins
9c8fd487a5
Initialize cs_invert to "false" in new csets.
2004-07-10 06:28:18 +00:00
Tim J. Robbins
9409835314
Report input errors instead of ignoring them.
2004-07-09 05:15:46 +00:00
Tim J. Robbins
e263a4b46e
Update for multibyte character support: remove BUGS and change the
description of the -c option to refer to "values" instead of "byte values".
2004-07-09 02:33:46 +00:00
Tim J. Robbins
ca99cfdd14
Add support for multibyte characters. The challenge here was to use
data structures that scale better with large character sets, instead of
arrays indexed by character value:
- Sets of characters to delete/squeeze are stored in a new "cset" structure,
which is implemented as a splay tree of extents. This structure has the
ability to store character classes (à la wctype(3)), but this is not
currently fully utilized.
- Mappings between characters are stored in a new "cmap" structure, which
is also a splay tree.
- The parser no longer builds arrays containing all the characters in a
particular class; instead, next() determines them on-the-fly using
nextwctype(3).
2004-07-09 02:08:07 +00:00
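The idea of storing a character set as extents instead of an array indexed by character value can be sketched as follows. This is a simplified illustration only: it keeps the extents in a small sorted array rather than a splay tree, and every name in it (struct extset, extset_add(), extset_in(), EXTSET_MAX) is hypothetical and not taken from tr's cset code. It does show the coalescing step: adding a character that touches an extent at max + 1 or min - 1 grows that extent instead of creating a new one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

#define EXTSET_MAX 64			/* hypothetical fixed capacity */

struct extent { wchar_t min, max; };	/* inclusive range */

struct extset {				/* sorted, disjoint, non-adjacent */
	struct extent ext[EXTSET_MAX];
	size_t n;
};

static void
extset_init(struct extset *s)
{
	s->n = 0;
}

/* Membership test: binary search over the sorted extents. */
static bool
extset_in(const struct extset *s, wchar_t ch)
{
	size_t lo = 0, hi = s->n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (ch < s->ext[mid].min)
			hi = mid;
		else if (ch > s->ext[mid].max)
			lo = mid + 1;
		else
			return (true);
	}
	return (false);
}

/* Add one character; returns false only if the array is full. */
static bool
extset_add(struct extset *s, wchar_t ch)
{
	size_t i = 0;

	/* Skip extents entirely below ch that do not end at ch - 1. */
	while (i < s->n && s->ext[i].max < ch && s->ext[i].max + 1 != ch)
		i++;

	if (i < s->n && s->ext[i].max < ch) {
		/* ch == max + 1: grow the left neighbour. */
		s->ext[i].max = ch;
		if (i + 1 < s->n && s->ext[i + 1].min == ch + 1) {
			/* The gap is closed: merge with the right neighbour. */
			s->ext[i].max = s->ext[i + 1].max;
			memmove(&s->ext[i + 1], &s->ext[i + 2],
			    (s->n - i - 2) * sizeof(s->ext[0]));
			s->n--;
		}
		return (true);
	}
	if (i < s->n && s->ext[i].min <= ch)
		return (true);		/* already a member */
	if (i < s->n && s->ext[i].min == ch + 1) {
		s->ext[i].min = ch;	/* ch == min - 1: grow downwards */
		return (true);
	}
	if (s->n == EXTSET_MAX)
		return (false);
	/* No neighbour to coalesce with: insert a one-character extent. */
	memmove(&s->ext[i + 1], &s->ext[i], (s->n - i) * sizeof(s->ext[0]));
	s->ext[i].min = s->ext[i].max = ch;
	s->n++;
	return (true);
}
```

Adding 'a', then 'c', then 'b' leaves a single extent a-c, so storage grows with the number of runs rather than with the size of the character set.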
Ruslan Ermilov
6a3e8b0adc
Mechanically kill hard sentence breaks.
2004-07-02 22:22:35 +00:00
Tim J. Robbins
6863e5bed2
Document incorrect handling of multibyte characters in input files
and character string arguments.
2004-06-28 07:19:11 +00:00
Andrey A. Chernov
035944c3b6
Back out [:upper:] and [:lower:] classes sorting, it is not required
by POSIX and gains nothing with current code.
2003-08-05 07:59:46 +00:00
Andrey A. Chernov
8ad968ee96
Clarify upper/lower conversion description more.
2003-08-05 07:53:28 +00:00
Andrey A. Chernov
bc44c44a14
Explain better what happens when [:lower:] <-> [:upper:]
2003-08-05 06:00:00 +00:00
Andrey A. Chernov
30c1156451
No functional changes, just code reorganization from prev. commit, it
makes one malloc unneeded, removes two bzero's and makes code more readable.
"Bright ideas comes only _after_ commits."
2003-08-04 05:22:06 +00:00
Andrey A. Chernov
21f53e9138
POSIX requires complex processing of 'c-c' ranges: if one of the endpoints
is an octal sequence, the range is taken in byte-value order; for non-octal
endpoints, the range is taken in sorted collation order.
Implement it.
2003-08-04 04:20:04 +00:00
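The two orderings in the entry above can be sketched with a membership test for a single-byte range. This is an illustration under stated assumptions, not the code from the commit: in_range() and coll_cmp() are hypothetical helpers, and the octal_endpoint flag stands in for whatever the parser records about how an endpoint was written.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Compare two single-byte characters in the locale's collation order. */
static int
coll_cmp(unsigned char a, unsigned char b)
{
	char sa[2] = { (char)a, '\0' };
	char sb[2] = { (char)b, '\0' };

	return (strcoll(sa, sb));
}

/*
 * Is ch inside the range lo-hi?  With an octal endpoint the range is
 * taken in byte-value order; otherwise it is taken in collation order.
 * Both helpers are hypothetical names used only for this sketch.
 */
static bool
in_range(unsigned char ch, unsigned char lo, unsigned char hi,
    bool octal_endpoint)
{
	if (octal_endpoint)
		return (lo <= ch && ch <= hi);
	return (coll_cmp(lo, ch) <= 0 && coll_cmp(ch, hi) <= 0);
}
```

In the default "C" locale the two orders coincide; they can differ once setlocale(3) selects a locale whose collation order is not the byte order.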
Andrey A. Chernov
796263418b
Special fix just for
tr -[cC]s '[:upper:]' '[:lower:]'
case (or vice versa):
chars taken from s2 can be different this time
due to lack of complex upper/lower processing,
so fill string2 again to not miss some.
2003-08-04 02:57:17 +00:00
Andrey A. Chernov
d7da7302f9
Microoptimization of prev. patch: do strdup() only if (cflag || Cflag)
2003-08-03 22:19:43 +00:00
Andrey A. Chernov
e42eb6838e
1) Fix -C - it was broken since introduced, wrong array sorted
2) Fix last (repeated) char after [:class:], it was \0 in original code
2003-08-03 22:02:49 +00:00
Andrey A. Chernov
761c008c99
Remove charcoll() stabilization added in 1.16, it gains nothing but conflicts
with ranges.
2003-08-03 04:18:07 +00:00
Andrey A. Chernov
a508a04d43
POSIX requires that 'c-c' ranges conform to collation and be in collation order
2003-08-03 03:51:27 +00:00
Andrey A. Chernov
00611f0457
This patch addresses two problems.
The first is relatively minor: according to our own manpage, the upper and
lower classes must be sorted, but currently they are not.
The second is serious:
tr '[:lower:]' '[:upper:]'
(and vice versa) currently works only if the upper and lower classes
have exactly the same number of elements. When that is not true, as in
many ISO8859-x locales, which have more lowercase letters,
tr may do nasty things.
See this page
http://www.opengroup.org/onlinepubs/007908799/xcu/tr.html
for detailed description of desired tr behaviour in such cases.
2003-08-03 02:23:39 +00:00
Jens Schweikhardt
d64ada501a
Fix typos, mostly s/ an / a / where appropriate and a few s/an/and/
Add FreeBSD Id tag where missing.
2002-12-30 21:18:15 +00:00
Ruslan Ermilov
06e482e60a
mdoc(7) police: markup polishing.
Approved by: re
2002-11-26 17:33:37 +00:00
Philippe Charnier
b9a86ec995
Use .Fl/Ar for flags and arguments.
2002-10-17 13:04:49 +00:00
David Malone
f4ac32def2
ANSIify function definitions.
Add some constness to avoid some warnings.
Remove use of the register keyword.
Deal with missing/unneeded extern/prototypes.
Some minor type changes/casts to avoid warnings.
Reviewed by: md5
2002-09-04 23:29:10 +00:00
Tim J. Robbins
6e9c52b638
When translating and -C is specified, behave as if the complemented set was
in the locale collating order as required by SUSv3.
2002-07-29 23:42:00 +00:00
Tim J. Robbins
482711cfa6
When translating and the -c option is specified, handle the case where the
second string argument is more than one character in length in the way
required by SUSv3 (and the way GNU textutils and SVR4 do it).
2002-07-29 14:50:54 +00:00
Tim J. Robbins
7dd4ac68f1
Use err instead of errx when malloc fails. "malloc" is not a helpful
error message.
2002-07-05 09:28:13 +00:00
Tim J. Robbins
232a0ff51d
Improve parsing of character and equivalence classes:
[:*] and [=*] are parsed as `infinitely many repetitions of :' (or =)
instead of literal characters (SUSv3)
2002-06-15 07:38:27 +00:00
Tim J. Robbins
dc20d4b9d4
Move the #include and #define's to the top of the file.
2002-06-14 15:56:52 +00:00
Tim J. Robbins
4efc23dabf
Bump the size of the equivalence set to NCHARS; this file was left out
of a previous commit implementing equivalence classes.
2002-06-14 15:53:38 +00:00
Tim J. Robbins
6eb0710e98
Sort sections. Avoid using "The -? option" at the start of option descriptions.
2002-06-14 10:11:41 +00:00
Tim J. Robbins
e73c3d279c
Don't treat the trailing ']' of an equivalence class expression as a
character in the set. tr -d '[=a=]' was deleting ]'s as well as a's.
Noticed by the textutils test suite.
2002-06-14 09:53:11 +00:00
Tim J. Robbins
dfac4f3695
Add the P1003.1-2001 -C option which complements the set of characters
(not byte values) specified by the first string argument.
2002-06-14 08:58:30 +00:00
Tim J. Robbins
85f6c317ea
Implement support for equivalence classes ([=e=]) when the mapping is
one-to-one (SUSv3)
2002-06-14 07:37:08 +00:00
Warner Losh
3f330d7d1a
remove __P
2002-03-22 01:42:45 +00:00
Alfred Perlstein
40e8dd712c
properly handle zero length first string when doing -c
PR: 34663
MFC After: 3 days
2002-03-02 10:36:37 +00:00
Mark Murray
787324755c
WARNS=2 fixes, use __FBSDID(), kill register keyword.
2001-12-11 23:36:25 +00:00
Ruslan Ermilov
d628d776c4
mdoc(7) police: utilize the new .Ex macro.
2001-08-15 09:09:47 +00:00
Ruslan Ermilov
753d686d34
mdoc(7) police: s/BSD/.Bx/ where appropriate.
2001-08-14 10:01:54 +00:00
Dima Dorfman
f247324df7
Remove whitespace at EOL.
2001-07-15 08:06:20 +00:00
Ruslan Ermilov
9597e1c260
mdoc(7) police: -column lists require column width specifiers.
2001-07-06 10:07:43 +00:00
Ruslan Ermilov
d0353b836e
mdoc(7) police: split punctuation characters + misc fixes.
2001-02-01 16:38:02 +00:00
Ruslan Ermilov
9b88faecd3
Prepare for mdoc(7)NG.
2000-12-19 16:00:12 +00:00
Ruslan Ermilov
8fe908ef0c
mdoc(7) police: use the new features of the Nm macro.
2000-11-20 19:21:22 +00:00
Ruslan Ermilov
726b61ab5f
Avoid use of direct troff requests in mdoc(7) manual pages.
2000-11-10 17:46:15 +00:00
Philippe Charnier
dbb9d8f826
Add DIAGNOSTICS section name
2000-03-26 15:06:46 +00:00
Peter Wemm
c3aac50f28
$Id$ -> $FreeBSD$
1999-08-28 01:08:13 +00:00
Nik Clayton
3be5f1f5ce
Add $Id$, to make it simpler for members of the translation teams to
track.
The $Id$ line is normally at the bottom of the main comment block in the
man page, separated from the rest of the manpage by an empty comment,
like so:
.\" $Id$
.\"
If the immediately preceding comment is a @(#) format ID marker then the
$Id$ will line up underneath it with no intervening blank lines.
Otherwise, an additional blank line is inserted.
Approved by: bde
1999-07-12 20:24:20 +00:00