Commit Graph

33 Commits

Author SHA1 Message Date
Andrey A. Chernov
5b4fa425ba Back out non-collating [a-z] ranges (r302594).
Instead of changing the whole course to another POSIX-permitted way
for consistency and uniformity I decide to completely ignore missing
regex fucntionality and focus on fixing bugs in what we have now,
too many small obstacles we have choicing other way, counting ports.
Corresponding libc changes are backed out in r302824.
2016-07-14 09:19:53 +00:00
Andrey A. Chernov
1ef4039ac7 1) Following r302512 (remove collation support for [a-z]-ranges in libc)
remove collation support for a-z ranges here too.
It was implemented for single byte locales only in any case.

2) Reduce [Cc]flag loop to WCHAR_MAX, WINT_MAX here includes WEOF which is
not a character.

3) Optimize [Cc]flag case: don't repeatedly add the last character of
string2 to squeeze cset when string2 reach its EOS state.

4) Reflect in the manpage that [=equiv=] is implemented for single
byte locales only.
2016-07-11 21:23:50 +00:00
Ed Schouten
b3608ae18f Replace index() and rindex() calls with strchr() and strrchr().
The index() and rindex() functions were marked LEGACY in the 2001
revision of POSIX and were subsequently removed from the 2008 revision.
The strchr() and strrchr() functions are part of the C standard.

This makes the source code a lot more consistent, as most of these C
files also call into other str*() routines. In fact, about a dozen
already perform strchr() calls.
2012-01-03 18:51:58 +00:00
Ed Schouten
b173dd440b Build tr(1) with WARNS=6. 2011-10-14 07:25:20 +00:00
Joel Dahl
da52b4caaf Remove the advertising clause from UCB copyrighted files in usr.bin. This
is in accordance with the information provided at
ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change

Also add $FreeBSD$ to a few files to keep svn happy.

Discussed with:	imp, rwatson
2010-12-11 08:32:16 +00:00
Jilles Tjoelker
88deede54a tr: Fix '[=]=]' equivalence class.
A closing bracket immediately after '[=' should not be treated as special.

Different from the submitted patch, a string ending with '[=' does not cause
access beyond the terminating '\0'.

PR:		bin/150384
Submitted by:	Richard Lowe
MFC after:	2 weeks
2010-09-29 22:24:18 +00:00
Xin LI
821df508e8 Revert most part of 200420 as requested, as more review and polish is
needed.
2009-12-13 03:14:06 +00:00
Xin LI
6f2d322192 Remove unneeded header includes from usr.bin/ except contributed code.
Tested with:	make universe
2009-12-11 23:35:38 +00:00
Maxim Konovalov
ba5b74d001 o Remove duplicate includes.
Obtained from:	Slava Semushin via NetBSD
2007-01-20 08:24:02 +00:00
Jordan K. Hubbard
03afb27c83 tr(1) attempts to convert \n[n][n] sequences into octal digits, but doesn't
check to see that a given digit is actually an octal digit.  This leads to
unusual consequences if passed in values like \9.

Reported by:	Joseph Davison (OpenDarwin project)
MFC after:	1 week
2004-11-14 05:15:25 +00:00
Tim J. Robbins
ca99cfdd14 Add support for multibyte characters. The challenge here was to use
data structures that scale better with large character sets, instead of
arrays indexed by character value:
- Sets of characters to delete/squeeze are stored in a new "cset" structure,
which is implemented as a splay tree of extents. This structure has the
ability to store character classes (ala wctype(3)), but this is not
currently fully utilized.
- Mappings between characters are stored in a new "cmap" structure, which
is also a splay tree.
- The parser no longer builds arrays containing all the characters in a
particular class; instead, next() determines them on-the-fly using
nextwctype(3).
2004-07-09 02:08:07 +00:00
Andrey A. Chernov
035944c3b6 Back out [:upper:] and [:lower:] classes sorting, it is not required
by POSIX and gains nothing with current code.
2003-08-05 07:59:46 +00:00
Andrey A. Chernov
30c1156451 No functional changes, just code reorganization from prev. commit, it
makes one malloc unneeded, removes two bzero's and makes code more readable.

"Bright ideas comes only _after_ commits."
2003-08-04 05:22:06 +00:00
Andrey A. Chernov
21f53e9138 POSIX require complex processing of 'c-c' ranges: if one of the endpoints
is octal sequence, range is taken in the byte values order, for non-octal
endpoints range is taken in the sorted collation order.

Implement it.
2003-08-04 04:20:04 +00:00
Andrey A. Chernov
e42eb6838e 1) Fix -C - it was broken since introduced, wrong array sorted
2) Fix last (repeated) char after [:class:], it was \0 in original code
2003-08-03 22:02:49 +00:00
Andrey A. Chernov
a508a04d43 POSIX requires 'c-c' must conform collate and be in collation order 2003-08-03 03:51:27 +00:00
Andrey A. Chernov
00611f0457 This patch address two problems.
1st one is relatively minor: according our own manpage, upper and lower
classes must be sorted, but currently not.

2nd one is serious:
	tr '[:lower:]' '[:upper:]'
	(and vice versa) currently works only if upper and lower classes
	have exact the same number of elements. When it is not true, like for
	many ISO8859-x locales which have bigger amount of lowercase letters,
	tr may do nasty things.

	See this page
	http://www.opengroup.org/onlinepubs/007908799/xcu/tr.html
	for detailed description of desired tr behaviour in such cases.
2003-08-03 02:23:39 +00:00
Tim J. Robbins
7dd4ac68f1 Use err instead of errx when malloc fails. "malloc" is not a helpful
error message.
2002-07-05 09:28:13 +00:00
Tim J. Robbins
232a0ff51d Improve parsing of character and equivalence classes:
[:*] and [=*] are parsed as `infinitely many repetitions of :' (or *)
instead of literal characters (SUSv3)
2002-06-15 07:38:27 +00:00
Tim J. Robbins
e73c3d279c Don't treat the trailing ']' of an equivalence class expression as a
character in the set. tr -d '[=a=]' was deleting ]'s as well as a's.
Noticed by the textutils test suite.
2002-06-14 09:53:11 +00:00
Tim J. Robbins
85f6c317ea Implement support for equivalence classes ([=e=]) when the mapping is
one-to-one (SUSv3)
2002-06-14 07:37:08 +00:00
Warner Losh
3f330d7d1a remove __P 2002-03-22 01:42:45 +00:00
Mark Murray
787324755c WARNS=2 fixes, use __FBSDID(), kill register keyword. 2001-12-11 23:36:25 +00:00
Peter Wemm
c3aac50f28 $Id$ -> $FreeBSD$ 1999-08-28 01:08:13 +00:00
Philippe Charnier
af647767ed Use err(3) instead of local redefinition. Cosmetic in usage(). 1997-08-18 07:24:58 +00:00
Joerg Wunsch
3281f9d8e5 Cast char's to (u_char) before passing them to isctype() functions. 1996-03-19 21:21:06 +00:00
Joerg Wunsch
6109ca90ae Fix a couple of sign-extension bugs.
Submitted by:	serg@bcs1.bcs.zaporizhzhe.ua (Sergey Shkonda)
1996-03-17 09:00:48 +00:00
Bruce Evans
67a8a10b9c Updated to BSD4.4lite2. Fixes PR836. `echo abcd | tr a-d A-BC-D' now
works.
1995-11-28 13:18:47 +00:00
Andrey A. Chernov
199482ead7 Fix broken charclass handling
Add setlocale LC_CTYPE
1995-10-28 22:27:03 +00:00
Poul-Henning Kamp
b8ab406ba1 Remove declamations which <ctype.h> already does for us. 1995-10-21 22:02:10 +00:00
Poul-Henning Kamp
37782cea67 Added #include <ctype.h> 1995-10-21 21:08:43 +00:00
Andrey A. Chernov
cfb92d36fb Fix print class mistype 1994-10-28 23:31:48 +00:00
Rodney W. Grimes
9b50d90275 BSD 4.4 Lite Usr.bin Sources 1994-05-27 12:33:43 +00:00