I'm starting with the easy cases. The leftovers need to be looked at a
bit more closely.
Note that this change _does_ modify the code of the old tests. This is
required in order to allow the code to locate the data files in the
source directory instead of the current directory, because Kyua
automatically changes the latter to a temporary directory.
Also note that at least one test is known to be broken here. Actually,
the test is not really broken: it's marked as a TODO but unfortunately
Kyua's TAP parser currently does not understand that. Will have to be
fixed separately.
data structures that scale better with large character sets, instead of
arrays indexed by character value:
- Sets of characters to delete/squeeze are stored in a new "cset" structure,
which is implemented as a splay tree of extents. This structure has the
ability to store character classes (ala wctype(3)), but this is not
currently fully utilized.
- Mappings between characters are stored in a new "cmap" structure, which
is also a splay tree.
- The parser no longer builds arrays containing all the characters in a
particular class; instead, next() determines them on-the-fly using
nextwctype(3).