135 Commits

Author SHA1 Message Date
ache
c80f41274a 1) Eliminate the possibility of calling __*collate_range_cmp() with an incomplete
locale (which causes a core dump) by removing the whole 'table' argument
through which it was passed.

2) Restore __collate_range_cmp() in __sccl().

3) Collate [a-z] ranges in regcomp() only for single-byte locales
(we can't do it for other ones yet). Previously only the first 256
wchars were considered and all others were silently dropped from the
range.
2016-07-14 09:07:25 +00:00
ache
34a9f4379b Back out non-collating [a-z] ranges.
Instead of changing the whole course to another POSIX-permitted way
for consistency and uniformity, I decided to completely ignore the missing
regex functionality and concentrate on fixing bugs in what we have now;
changing course would hit too many small obstacles, counting ports.
2016-07-14 08:18:12 +00:00
ache
ea21df9888 Remove broken support for collation in [a-z] type ranges.
Only the first 256 wide chars are currently considered; all others are just
dropped from the range. A proper implementation requires a reverse-table
database lookup, since the objects are as big as the full UTF-8 range (1114112
code points), so the same scanning that was done for 256 chars would
slow things down.

POSIX does not require collation for [a-z] type ranges and does not
prohibit it for non-POSIX locales. POSIX requires collation for ranges
only in the POSIX (or C) locale, which is equal to ASCII and binary for
other chars, so we already have it.

No other *BSD implements collation for [a-z] type ranges.

Restore ABI compatibility with the now-unused __collate_range_cmp(), which
is visible from outside (it will be removed later).
2016-07-10 03:49:38 +00:00
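To make the distinction in the entry above concrete, here is a minimal sketch of the two ways a single-byte [lo-hi] range test can be read: by raw byte value, or by collation order via strcoll(3). Both function names are hypothetical and this is not the libc implementation; it only assumes that the two readings should agree in the C/POSIX locale, as the message states.

```c
#include <string.h>

/* Hypothetical helper: range membership by raw byte value. */
int in_range_bytewise(unsigned char c, unsigned char lo, unsigned char hi)
{
    return lo <= c && c <= hi;
}

/* Hypothetical helper: range membership by the current locale's
 * collation order, for single-byte characters only. */
int in_range_collated(unsigned char c, unsigned char lo, unsigned char hi)
{
    char cs[2] = { (char)c, '\0' };
    char los[2] = { (char)lo, '\0' };
    char his[2] = { (char)hi, '\0' };

    return strcoll(los, cs) <= 0 && strcoll(cs, his) <= 0;
}
```

In a locale with a different collation order the second function may accept characters that the first rejects, which is the behavior difference the commits in this series are arguing about.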
pfg
0366f1b527 libc/locale: Fix type breakage in __collate_range_cmp().
When collation support was brought in, the second and third
arguments in __collate_range_cmp() were changed from int to
wchar_t, breaking the ABI. Change them to a "char" type which
makes more sense and keeps the ABI compatible.

Also introduce __wcollate_range_cmp() which does work with wide
characters. This function is used only internally in libc so
we don't export it. Use the new function in glob(3), fnmatch(3),
and regexec(3).

PR:		179721
Suggested by:	ache, jilles
MFC after:	3 weeks (perhaps partial only)
2016-06-05 19:12:52 +00:00
pfg
bcd6e4b333 libc: regexec(3) adjustment.
Change the behavior when REG_STARTEND is combined with REG_NOTBOL.

From the original posting[1]:

"Enable the assumption that pmatch[0].rm_so is a continuation offset
to a string and allows us to do a proper assessment of the character
in regards to its word position ('^' or '\<'), without risking going
into unallocated memory."

This change makes us similar to how glibc handles REG_STARTEND |
REG_NOTBOL, and is closely related to a soon-to-land fix to sed.

Special thanks to Martijn van Duren and Ingo Schwarze for working
out some consistent behaviour.

Differential Revision:	https://reviews.freebsd.org/D6257
Taken from:	openbsd-tech 2016-05-24 [1]  (Martijn van Duren)
Relnotes:	yes
MFC after:	1 month
2016-05-25 15:35:23 +00:00
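As a rough illustration of the flags discussed in the entry above, the sketch below resumes a search at a byte offset using REG_STARTEND, optionally combined with REG_NOTBOL. The function name search_from is hypothetical, and the #else branch is an assumed fallback for C libraries that lack the REG_STARTEND extension. The sketch deliberately uses an unanchored pattern, since how '^' and '\<' behave at the continuation offset is exactly what this commit adjusts.

```c
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper: search text for a POSIX basic RE starting at byte
 * offset start.  Returns the match offset relative to the beginning of
 * text, or -1 if there is no match or the arguments are invalid.
 */
long search_from(const char *text, size_t start, const char *pattern,
    int notbol)
{
    regex_t re;
    regmatch_t m;
    size_t len = strlen(text);
    int eflags = notbol ? REG_NOTBOL : 0;
    long off = -1;

    if (start > len || regcomp(&re, pattern, 0) != 0)
        return -1;
#ifdef REG_STARTEND
    /* pmatch[0] delimits the substring; results stay relative to text. */
    m.rm_so = (regoff_t)start;
    m.rm_eo = (regoff_t)len;
    if (regexec(&re, text, 1, &m, eflags | REG_STARTEND) == 0)
        off = (long)m.rm_so;
#else
    /* Assumed fallback: the matcher cannot see the preceding character. */
    if (regexec(&re, text + start, 1, &m, eflags) == 0)
        off = (long)m.rm_so + (long)start;
#endif
    regfree(&re);
    return off;
}
```

A caller such as sed(1) would typically pass the end of the previous match as start to continue scanning the same line.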
pfg
27a3170907 libc/regex: fix two buffer underruns.
Fix some rather complex regex issues found on OpenBSD as part of some
ongoing work to fix a sed(1) bug.

Curiously the OpenBSD tests don't trigger segfaults on FreeBSD but the
bugs were confirmed by running a port of FreeBSD's regex under OpenBSD's
malloc. Huge thanks to Ingo for confirming the behavior.

Taken from:	Ingo Schwarze (through openbsd-tech 2016-05-15)
MFC after:	1 week
2016-05-21 19:54:10 +00:00
pfg
69669cbe99 libc: spelling fixes.
Mostly in comments.
2016-04-30 01:24:24 +00:00
pfg
f81f18e8d3 regex: prevent two improbable signed integer overflows.
In matcher() we used an integer to index nsub of type size_t.
In print() we used an integer to index nstates of type sopno,
typedef'd long.
In both cases the indexes never take negative values.

Match the types to avoid any error.

MFC after:	5 days
2016-04-23 20:45:09 +00:00
ngie
d5e50bf50f Add -static to CFLAGS to unbreak the tests by using a libc.a with
the xlocale private symbols exposed which aren't exposed publicly
via the DSO

PR: 191354
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-12-13 06:33:52 +00:00
ngie
912c9a615d Fix -Wformat issues and minor whitespace issues in surrounding areas
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-12-05 02:25:20 +00:00
ngie
9e66d95c5d split.ih:
- Create automatically generated include header for split.c

main.c:
- Use function definitions from debug.ih and split.ih instead of externs

Sponsored by: EMC / Isilon Storage Division
2015-12-05 02:23:44 +00:00
ngie
06d2ea6728 Use == instead of = in the function comment above split(..) so mkh -p
exposes split(..).

MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-12-05 02:18:36 +00:00
ngie
ac2d4b2c97 Use ANSI C function prototypes/definitions instead of K&R style ones
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-12-05 02:07:55 +00:00
ngie
1a9e649e9e Add missing headers and sort #includes per style(9)
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-12-05 01:19:35 +00:00
ngie
ae11a85edd - Use ANSI C function prototypes/definitions instead of K&R style ones
- Add a missing return type for main(..)

MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-12-05 01:13:18 +00:00
ngie
d6f598ef53 Fix -Wformat warnings by using the correct format qualifiers
MFC after: 1 week
Sponsored by: EMC / Isilon Storage Division
2015-12-05 01:12:58 +00:00
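For readers unfamiliar with this class of warning, a small sketch of matching format qualifiers to argument types follows. The function and its parameter names are hypothetical and are not taken from the commit; the point is only that size_t pairs with %zu and long with %ld.

```c
#include <stdio.h>

/* Hypothetical helper: format a size_t and a long without -Wformat noise. */
int format_counts(char *buf, size_t buflen, size_t count, long offset)
{
    /* %zu matches size_t; %ld matches long. */
    return snprintf(buf, buflen, "count=%zu offset=%ld", count, offset);
}
```

Using %d for either argument would compile on many platforms but is undefined behavior wherever the types differ in width from int.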
bapt
0eea96b3d2 mdoc: rendering fixes
2015-04-26 10:55:39 +00:00
pfg
565e4b83c1 computematchjumps(): fix allocator sizeof operand mismatch.
Mostly a cosmetic warning.

Found by:	Clang static analyzer
2015-04-22 17:09:02 +00:00
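The kind of mismatch the analyzer reports can be avoided by taking sizeof from the pointer being assigned rather than from a type name. The sketch below shows that idiom; alloc_jump_table is a hypothetical name and the code is not the actual computematchjumps() allocation.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate and zero a table of n ints. */
int *alloc_jump_table(size_t n)
{
    int *table;
    size_t i;

    /* sizeof(*table) tracks the pointee type if the declaration changes. */
    table = malloc(n * sizeof(*table));
    if (table == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        table[i] = 0;
    return table;
}
```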
pfg
9705f06cfe Prevent NULL pointer de-reference.
As a follow up to r279090, if dp hasn't been defined, we
shouldn't attempt to do an optimization here.
2015-02-21 15:02:27 +00:00
pfg
6fa37b8849 regex(3): Fix uninitialized pointer values.
CID:	405582	(also clang static checker)
CID:	1018724
2015-02-20 21:21:38 +00:00
delphij
70c79b42a2 Disallow pattern spaces which would cause intermediate calculations to
overflow size_t.

Obtained from:	DragonFly (2841837793bd095a82f477e9c370cfe6cfb3862c dillon)
Security:	CERT VU#695940
MFC after:	3 days
2015-02-14 00:23:53 +00:00
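A common way to guard against this class of problem is to check the multiplication before allocating. The following is a minimal sketch of that check, assuming a hypothetical helper name alloc_checked; it is not the code from the commit.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: refuse requests whose byte count overflows size_t. */
void *alloc_checked(size_t count, size_t size)
{
    /* count * size would wrap around if count > SIZE_MAX / size. */
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;
    return malloc(count * size);
}
```

Without the check, a wrapped product yields a small allocation that later code indexes as if it were large.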
joel
fb7abcd8fc mdoc: remove EOL whitespace.
2014-12-29 13:50:59 +00:00
delphij
6227a44c22 Plug a memory leak.
Obtained from:	DragonFlyBSD (commit 5119ece)
MFC after:	2 weeks
2014-12-19 06:48:47 +00:00
pfg
1b1577745c regex(3): Add support for \< and \> word delimiters
Solaris and other OSs have support for \< and \> as word
delimiters in utilities like sed(1). These are useful to
have for general compatibility with Solaris but should be
avoided for portability with other systems, including the
traditional BSDs.

Bump __FreeBSD_version as this is likely to affect some
userland utilities.

Reference:
https://www.illumos.org/issues/516

PR:		bin/153257
Obtained from:	Illumos
MFC after:	1 month
2014-06-30 20:54:25 +00:00
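A short sketch of the delimiters in use through regcomp(3) follows. The helper name is hypothetical, and the sketch assumes a C library whose basic REs accept \< and \>, which, as the message notes, is not guaranteed everywhere.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: 1 if text contains "cat" as a whole word,
 * 0 if not, -1 if the pattern is rejected by this libc. */
int has_whole_word_cat(const char *text)
{
    regex_t re;
    int rc;

    /* \< anchors at the start of a word, \> at the end of one. */
    if (regcomp(&re, "\\<cat\\>", REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```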
pfg
84ca9a6378 Revert r267675:
The code doesn't really benefit from using reallocf() in this case.
Also, the realloc() results are assigned to a temporary variable, which
makes blind replacement with reallocf() mostly useless.

Pointed out by:		stefanf, bde
2014-06-21 01:43:56 +00:00
pfg
76a4926d96 regex: Make use of reallocf().
Use of reallocf is useful in libraries as we are not certain the
application will exit after NULL.

This somewhat reduces portability, but since you are building
this as part of libc it is likely you have our non-standard
reallocf(3) already.

Reviewed by:	ache
MFC after:	5 days
2014-06-20 15:29:09 +00:00
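The two allocation patterns at issue in this commit and its revert can be sketched as follows. reallocf_sketch is a hypothetical, portable stand-in that mimics the documented behavior of FreeBSD's reallocf(3) (free the old block on failure); grow_keep_old shows the temporary-variable pattern that makes the replacement unnecessary. Neither is the regex code itself.

```c
#include <stdlib.h>

/* Hypothetical stand-in for reallocf(3): on failure, free the old block. */
void *reallocf_sketch(void *ptr, size_t size)
{
    void *np = realloc(ptr, size);

    if (np == NULL && ptr != NULL && size != 0)
        free(ptr);
    return np;
}

/* Temporary-variable pattern: on failure *buf is left valid and owned
 * by the caller, so nothing leaks and nothing is freed behind its back. */
int grow_keep_old(char **buf, size_t newsize)
{
    char *tmp = realloc(*buf, newsize);

    if (tmp == NULL)
        return -1;
    *buf = tmp;
    return 0;
}
```

With the second pattern the caller still holds the old pointer after a failure, so freeing it inside the allocator would risk a double free.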
pfg
8cec0590bb Revert r265367:
Use of calloc instead of malloc in regex (from OpenBSD).

In this case the change makes no sense since we are using realloc() later.

Reported by:	ache
2014-05-05 18:04:57 +00:00
pfg
c38af089e6 regex: Use calloc instead of malloc.
Mostly to reduce differences with OpenBSD.

Obtained from:	OpenBSD (CVS rev. 1.17)
MFC after:	3 days
2014-05-05 16:41:15 +00:00
pfg
19952369c8 regex: Remove some unreachable breaks.
This is based on a much bigger cleanup done in Illumos.

Reference:
https://www.illumos.org/issues/2077

MFC after:	1 week
2014-05-01 23:34:14 +00:00
marcel
99c9726a00 Replace use of ${.CURDIR} by ${LIBC_SRCTOP} and define ${LIBC_SRCTOP}
if not already defined. This allows building libc from outside of
lib/libc using a reach-over makefile.

A typical use-case is to build a standard ILP32 version and a COMPAT32
version in a single iteration by building the COMPAT32 version using a
reach-over makefile.

Obtained from:	Juniper Networks, Inc.
2014-03-04 02:19:39 +00:00
delphij
3e4a1731aa Fix assignment of maximum boundary.
Submitted by:	Sascha Wildner <saw online de>
Obtained from:	DragonFly rev fd39c81ba220f7ad6e4dc9b30d45e828cf58a1ad
MFC after:	2 weeks
2013-03-01 23:26:13 +00:00
theraven
1c33114ffd Remove some duplicated copyright notices.
Approved by:	dim (mentor)
2012-03-06 12:53:44 +00:00
theraven
0f6ef690b3 Implement xlocale APIs from Darwin, mainly for use by libc++. This adds a
load of _l suffixed versions of various standard library functions that use
the global locale, making them take an explicit locale parameter.  Also
adds support for per-thread locales.  This work was funded by the FreeBSD
Foundation.

Please test any code you have that uses the C standard locale functions!

Reviewed by:    das (gdtoa changes)
Approved by:    dim (mentor)
2011-11-20 14:45:42 +00:00
kevlo
85b2830346 Converting int to wint_t leads to broken comparison of raw char
and encoded wint_t.

Spotted by:	ache
2011-11-11 01:35:07 +00:00
kevlo
38e063bea1 - Don't handle out-of-memory condition
- Fix types of function arguments to match their declaration

Reviewed by:	delphij
Obtained from:	NetBSD
2011-11-10 01:44:05 +00:00
uqs
8ae3afcfad mdoc: drop redundant .Pp and .LP calls
They have no effect when coming in pairs, or before .Bl/.Bd
2010-10-08 12:40:16 +00:00
dds
510264aac8 Fix an off-by-one error in the marking of the O_CH operator
following an OOR2 operator.

PR:		130504
MFC after:	2 weeks
2009-09-16 06:32:23 +00:00
dds
961cacc4b4 Add a couple of debugging statements.
2009-09-16 06:29:23 +00:00
dds
d865692bb0 Add two test cases from PR 130504.
An additional one coming from http://www.research.att.com/~gsf/testregex/
was not added; at some point the entire AT&T regression test harness
should be imported here.
But that would also mean commitment to fix the uncovered errors.

PR:		130504
Submitted by:	Chris Kuklewicz
2009-09-15 21:15:29 +00:00
keramida
8ba042b0de Add two example regexps: (1) one for matching all the characters
that belong in a character class, and (2) one for matching all
the characters *not* in a character class.

Submitted by:	Mark B, mkbucc at gmail.com
MFC after:	3 days
2008-09-05 17:41:20 +00:00
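The two kinds of expression described above can be tried from C roughly as follows. The patterns used here, built on [[:alpha:]], are illustrative assumptions and are not necessarily the examples that the commit added; the helper name matches is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: 1 if text matches the extended RE pattern,
 * 0 if it does not, -1 if the pattern fails to compile. */
int matches(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For example, matches("^[[:alpha:]]+$", s) accepts strings made only of class members, while matches("^[^[:alpha:]]+$", s) accepts strings made only of characters outside the class.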
kevlo
c74ac9adc1 getopt(3) returns -1, not EOF.
2008-02-18 03:19:25 +00:00
delphij
f848dcc6bb Diff reduction against other *BSDs: ANSIfy function
prototypes.  No function changes.
2007-06-11 03:05:54 +00:00
delphij
0debc89a00 Const'ify and ANSIfy the internal interfaces of regex(3).
This is the final change that lets libc compile with
WERROR on my amd64 crashbox.
2007-05-25 12:44:58 +00:00
deischen
2a7306fdc5 Use C comments since we now preprocess these files with CPP.
2007-04-29 14:05:22 +00:00
delphij
b444fd4080 Test cases for back references.
Obtained from:	OpenBSD
2007-03-05 09:44:41 +00:00
delphij
366513d474 Only stop evaluation of a back reference if the match length is
zero and the recursion level is too deep.

Obtained from:	OpenBSD
2007-03-05 09:43:55 +00:00
delphij
01efaf95b8 Avoid infinite recursion on:
echo "foo foo bar bar bar baz" | sed 's/\([^ ]*\)\( *\1\)*/\1/g'

Obtained from:	OpenBSD via NetBSD (rev. 1.18)
2007-03-05 03:07:36 +00:00
imp
cd1f140ae4 Per Regents of the University of California letter, remove advertising
clause.

# If I've done so improperly on a file, please let me know.
2007-01-09 00:28:16 +00:00
deischen
a0f6b0f1d0 Add each directory's symbol map file to SYM_MAPS.
2006-03-13 01:15:01 +00:00
deischen
138dd54357 Add symbol maps and initial symbol version definitions to libc.
Reviewed by:	davidxu
2006-03-13 00:53:21 +00:00