When matching a regex with ^, it would attempt to access
gototab[NSTATES][NCHARS+2], and therefore access the state for the \002
character instead. This change is required to run awk under CHERI (with
sub-object bounds) and when running with UBSan instrumentation.
This was committed upstream as cbf924342b
Found by: CHERI (with subobject bounds enabled)
Obtained from: CheriBSD
Reviewed By: imp
Differential Revision: https://reviews.freebsd.org/D26509
Note: this backs out a number of changes we've made to awk because
they aren't upstream, but are on the vendor branch. Those will be
reapplied. svn makes it needlessly difficult to know which ones, but
at least r315426, r301289, and maybe r301691, though there may be
others too. None of these are critical, so bisecting through this
point is safe for all but awk regression tests :).
$ echo x | awk '/[[:cntrl:]]/'
x
The NUL character in cntrl class truncates the pattern, and an empty
pattern matches anything. The patch skips NUL as a quick fix.
PR: 195792
Submitted by: kdrakehp@zoho.com
Approved by: bwk@cs.princeton.edu (the author)
MFC after: 3 days
Instead of changing the whole course to another POSIX-permitted way
for consistency and uniformity I decide to completely ignore missing
regex fucntionality and focus on fixing bugs in what we have now,
too many small obstacles we have choicing other way, counting ports.
Corresponding libc changes are backed out in r302824.
I'll try to keep the change very minimal to not touch contribed code much.
I'll send it upstream when it will be merged to main branches,
but we need the change right now here.
- Make one-true-awk respect locale's collating order in [a-z]
bracket expressions, until a more complete fix (like handing
BREs) is ready.
- Don't require a space between -[fv] and its argument.