I'm starting with the easy cases. The leftovers need to be looked at a
bit more closely.
Note that this change _does_ modify the code of the old tests. This is
required in order to allow the code to locate the data files in the
source directory instead of the current directory, because Kyua
automatically changes the latter to a temporary directory.
Also note that at least one test is known to be broken here. Actually,
the test is not really broken: it's marked as a TODO but unfortunately
Kyua's TAP parser currently does not understand that. Will have to be
fixed separately.
The index() and rindex() functions were marked LEGACY in the 2001
revision of POSIX and were subsequently removed from the 2008 revision.
The strchr() and strrchr() functions are part of the C standard.
This makes the source code a lot more consistent, as most of these C
files also call into other str*() routines. In fact, about a dozen
already perform strchr() calls.
is in accordance with the information provided at
ftp://ftp.cs.berkeley.edu/pub/4bsd/README.Impt.License.Change
Also add $FreeBSD$ to a few files to keep svn happy.
Discussed with: imp, rwatson
A closing bracket immediately after '[=' should not be treated as special.
Different from the submitted patch, a string ending with '[=' does not cause
access beyond the terminating '\0'.
PR: bin/150384
Submitted by: Richard Lowe
MFC after: 2 weeks
- Mention that some of them are POSIX extensions. [2]
PR: docs/85062 [1]
Submitted by: Toby Peterson [1]
Obtained from: wctype(3) [2]
MFC after: 3 days
check to see that a given digit is actually an octal digit. This leads to
unusual consequences if passed in values like \9.
Reported by: Joseph Davison (OpenDarwin project)
MFC after: 1 week
data structures that scale better with large character sets, instead of
arrays indexed by character value:
- Sets of characters to delete/squeeze are stored in a new "cset" structure,
which is implemented as a splay tree of extents. This structure has the
ability to store character classes (ala wctype(3)), but this is not
currently fully utilized.
- Mappings between characters are stored in a new "cmap" structure, which
is also a splay tree.
- The parser no longer builds arrays containing all the characters in a
particular class; instead, next() determines them on-the-fly using
nextwctype(3).
tr -[cC]s '[:upper:]' '[:lower:]'
case (or vice versa):
chars taken from s2 can be different this time
due to lack of complex upper/lower processing,
so fill string2 again to not miss some.
1st one is relatively minor: according our own manpage, upper and lower
classes must be sorted, but currently not.
2nd one is serious:
tr '[:lower:]' '[:upper:]'
(and vice versa) currently works only if upper and lower classes
have exact the same number of elements. When it is not true, like for
many ISO8859-x locales which have bigger amount of lowercase letters,
tr may do nasty things.
See this page
http://www.opengroup.org/onlinepubs/007908799/xcu/tr.html
for detailed description of desired tr behaviour in such cases.
Add some constness to avoid some warnings.
Remove use register keyword.
Deal with missing/unneeded extern/prototypes.
Some minor type changes/casts to avoid warnings.
Reviewed by: md5