Instead of changing the whole course to another POSIX-permitted way
for consistency and uniformity I decide to completely ignore missing
regex fucntionality and focus on fixing bugs in what we have now,
too many small obstacles we have choicing other way, counting ports.
Corresponding libc changes are backed out in r302824.
Instead of trying to expand whole range at regcomp() stage as we do,
GNU regex allocates separate ranges [start,end] set each character
is checked against, so collation is possible and turned on for ranges here.
When something like that will be implemented or our obsoleted regex code
will be replaced to something like TRE, and in case we decide to use
collation in [a-z] ranges, all changes related to r302512 can be backed out,
but now we need consistency.
Change several int variables to size_t, ssize_t, or ptrdiff_t.
This should fix the bug described in CVE-2012-5667 when an input
line is so long that its length cannot be stored in an int
variable.
This is based on NetBSD's revision which says:
This change to NetBSD's version of GNU grep 2.5.1 (licenced under
GPLv2) was made without direct reference to any code licenced
under GPLv3.
Obtained from: NetBSD
MFC after: 3 days
character representation of input data across calls to dfaexec(), and by
caching the lengths of character across calls to check_multibyte_string().
Obtained from: Fedora (Tim Waugh)
was lost). Restore original version to try and avoid breaking the build
while David O'brien does a proper set of imports and merges.
Requested by: obrien