freebsd-skq/lib/libc/regex
Miod Vallat d36b5dbe28 libc: regex: rework unsafe pointer arithmetic
regcomp.c uses the "start + count < end" idiom to check that there are
"count" bytes available in the array of char that "start" and "end" both
point into.

This is fine, unless "start + count" goes further than one past the last
element of the array. In this case, a pedantic interpretation of the C
standard makes computing such a pointer, and therefore comparing it against
"end", undefined, and optimizers from hell will happily remove as much code
as possible because of this.

An example of this occurs in regcomp.c's bothcases(), which defines
bracket[3], sets "next" to "bracket" and "end" to "bracket + 2". Then it
invokes p_bracket(), which starts with "if (p->next + 5 < p->end)"...

Because bothcases() and p_bracket() are static functions in regcomp.c, there
is a real risk of miscompilation if aggressive inlining happens.
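
As a rough illustration of the layout described above, the sketch below
sets up a 3-byte array with "next" at its start and "end" two bytes in.
The struct, the function name and the array contents are hypothetical
stand-ins, not the actual regcomp.c code.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the parser state. */
struct window_sketch {
	const char *next;	/* next character to consume */
	const char *end;	/* one past the last valid character */
};

/*
 * Builds the window described in the text and returns how many bytes
 * lie between "next" and "end".
 */
ptrdiff_t
bothcases_window(void)
{
	/* Placeholder contents; only the array size matters here. */
	static const char bracket[3] = { 'a', ']', '\0' };
	struct window_sketch p;

	p.next = bracket;
	p.end = bracket + 2;

	/*
	 * "p.next + 5" would land beyond bracket + 3 (one past the end),
	 * so that pointer is never formed here. The subtraction below
	 * only involves pointers into the array and is well-defined.
	 */
	return (p.end - p.next);
}
```

With only 2 bytes in the window, a check for 5 available bytes has to
fail, and it can be decided without ever computing "next + 5".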

The following diff rewrites the "start + count < end" constructs into "end -
start > count". Assuming "end" and "start" always point into the same array
(such as "bracket[3]" above) or one past its last element, "end - start" is
well-defined and can be compared without trouble.
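
A minimal sketch of the rewritten check, assuming a helper function
rather than the macros regcomp.c uses. The name has_more_than() and its
signature are made up for illustration.

```c
#include <stddef.h>

/*
 * Returns nonzero if more than "count" bytes lie between "start" and
 * "end". Both pointers must point into the same array (or one past its
 * last element).
 *
 * The old idiom would read "start + count < end", which is undefined
 * when "start + count" overshoots the array. The subtraction below
 * never forms an out-of-range pointer.
 */
int
has_more_than(const char *start, const char *end, ptrdiff_t count)
{
	return (end - start > count);
}
```

For a window of 2 bytes, has_more_than(start, end, 1) holds, while counts
of 2 or 5 do not, matching what "start + count < end" was meant to say.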

As a bonus, MORE2() implies MORE(), so SEETWO() can be simplified a bit.
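
The simplification can be sketched as follows. The macro names come from
the text above, but the bodies shown here, the struct and the see_two()
wrapper are assumptions made for illustration and may differ from the
real regcomp.c definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the parser state. */
struct parse_sketch {
	const char *next;	/* next character to consume */
	const char *end;	/* one past the last valid character */
};

/* Assumed shapes of the checks, written in the "end - start > count" form. */
#define	MORE()		(p->end - p->next > 0)
#define	MORE2()		(p->end - p->next > 1)
#define	PEEK()		(*p->next)
#define	PEEK2()		(*(p->next + 1))

/*
 * If more than one byte remains, more than zero bytes remain as well,
 * so a separate MORE() test in front of MORE2() adds nothing.
 */
#define	SEETWO(a, b)	(MORE2() && PEEK() == (a) && PEEK2() == (b))

/* Wrapper so the macro can be exercised as a plain function. */
int
see_two(const struct parse_sketch *p, char a, char b)
{
	return (SEETWO(a, b));
}
```

Because MORE2() is evaluated first, PEEK() and PEEK2() are only reached
when both bytes are inside the window.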

PR:		252403
2021-01-08 13:58:35 -06:00
grot
cname.h
COPYRIGHT
engine.c libregex: implement \b and \B (word boundary, not word boundary) 2020-12-05 03:16:05 +00:00
Makefile.inc Add libregex, connect it to the build 2018-01-22 02:44:41 +00:00
re_format.7
regcomp.c libc: regex: rework unsafe pointer arithmetic 2021-01-08 13:58:35 -06:00
regerror.c
regex2.h libc: regex: retire internal EMPTBR ("Empty branch present") 2020-12-05 03:18:48 +00:00
regex.3 regex(3): belatedly document REG_POSIX from r363734 2020-08-04 02:06:49 +00:00
regexec.c
regfree.c
Symbol.map regex(3): Interpret many escaped ordinary characters as EESCAPE 2020-07-29 23:21:56 +00:00
utils.h regcomp: reduce size of bitmap for multibyte locales 2018-12-12 04:23:00 +00:00
WHATSNEW