Import GNU grep 2.5.1 (trimmed)

Tim J. Robbins 2004-07-04 09:52:08 +00:00
parent 7a39f4da90
commit 6fdbbb5487
Notes: svn2git 2020-12-20 02:59:44 +00:00
svn path=/vendor/misc-GNU/dist1/; revision=131554
54 changed files with 8286 additions and 1125 deletions

@ -38,4 +38,7 @@ it came straight from gawk-3.0.3 with small editing and fixes.
Many folks contributed; see THANKS. If I omitted someone, please
send me email.
Alain Magloire is the current maintainer.
Alain Magloire maintained GNU grep until version 2.5e.
Bernhard "Bero" Rosenkränzer <bero@redhat.com> is the current maintainer.

@ -2,7 +2,7 @@
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
@ -291,7 +291,7 @@ convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
Copyright (C) 19yy <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@ -313,7 +313,7 @@ Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision version 69, Copyright (C) 19yy name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.

@ -1,3 +1,983 @@
2002-03-26 Bernhard Rosenkraenzer <bero@redhat.com>
* src/grep.c: Don't fail if we don't have an stdout fd and -q
is used (happens e.g. on calls from hotplug scripts)
* src/grep.c: Don't hang forever if fed with an empty string to
grep for and --color enabled
* src/grep.c: Fix infinite loop on
echo "1 one" | grep -E "[0-9]*" -o
echo "1 one" | grep -E "[0-9]*" --color
* po/*: Sync with translation project
* src/grep.c, src/Makefile.am, configure.in: Add patch from
Paul Eggert <eggert@twinsun.com> to comply with ridiculous
guidelines (don't act differently if invoked as egrep or fgrep)
* configure.in: Bump version number, require a recent autoconf
2002-03-14 Bernhard Rosenkraenzer <bero@redhat.com>
* src/Makefile.am, po/Makefile.in.in: Support DESTDIR properly
* tests/bre.tests: Add fix from
Peter Breitenlohner <peb@mppmu.mpg.de>
2002-03-13 Bernhard Rosenkraenzer <bero@redhat.com>
* configure.in, m4/regex.m4, m4/malloc.m4, m4/realloc.m4:
Don't set LIBOBJS directly, autoconf 2.53 doesn't like it
* intl/*: Sync with gettext 0.11
* po/*: Sync with translation project
* configure.in, src/Makefile.am: Don't duplicate code - make
egrep and fgrep links to grep and set matcher based on
application name, suggestion from
Guillaume Cottenceau <gc@mandrakesoft.com>
* src/grep.c: (prline) Add fix for -i --color from
Jim Meyering <meyering@lucent.com>
* configure.in: Version 2.5; release
2002-01-23 Bernhard Rosenkraenzer <bero@redhat.com>
* configure.in: Version 2.5g
* Makefile.cvs, grep.spec: Add packaging tools
Merge djgpp changes from Andrew Cottrell <anddjgpp@ihug.coml.au>:
* src/grep.c: Added conditional compilation for DJGPP
* djgpp: remove directory as it is no longer required with DJGPP 2.03
(or 2.04 when released)
* README.DOS: Moved djgpp/readme to readme.dos
* PATCHES.AC, PATCHES.AM: delete files - redundant
* configure.in, Makefile.am: remove djgpp directory from list
2002-01-22 Bernhard Rosenkraenzer <bero@redhat.com>
* doc/grep.texi, doc/grep.1, NEWS: Document --label
* po/ru.po: Sync with translation project
* po/grep.pot: Sync with source
2002-01-18 Bernhard Rosenkraenzer <bero@redhat.com>
* src/grep.c: Add --label, based on patch from Stepan Koltsov
2001-11-20 Bernhard Rosenkraenzer <bero@redhat.com>
* autogen.sh: Don't hardcode aclocal dir
2001-11-19 Bernhard Rosenkraenzer <bero@redhat.com>
* src/grep.c: Add --only-matching (-o) switch (see NEWS)
* doc/grep.texi, doc/grep.1, NEWS: Document changes
* configure.in, lib/Makefile.am: Don't use internal getopt if
we're on a system that provides a working getopt function
2001-09-25 Bernhard Rosenkraenzer <bero@redhat.com>
* configure.in: Detect pcre correctly even when it's in
non-standard locations, using pcre-config
* src/grep.c: Add --color={always,never,tty} argument (like in ls)
* src/grep.c: Turn off blinking in the default colorization
* src/grep.c: Add --devices (-D) switch (analogous to --directories)
* src/dfa.c: Fix an i18n bug: echo "A" | grep '[A-Z0-9]' wouldn't work
in non-C-Locales on systems using current versions of glibc.
* AUTHORS: Change maintainer, credit Alain for his work until now
* configure.in, m4/decl.m4, m4/dosfile.m4, m4/gettext.m4,
m4/init.m4, m4/install.m4, m4/largefile.m4, m4/lcmessage.m4,
m4/header.m4, m4/isc-posix.m4, m4/missing.m4, m4/progtest.m4,
m4/sanity.m4:
Fix build with autoconf 2.5x, retain 2.1x compatibility for now
* autogen.sh: Add some crude hacks to make it possible to build with
both autoconf 2.5x and 2.1x
* acconfig.h: removed (no longer required)
* Makefile.am: add cvs-clean target
* doc/grep.texi, doc/grep.1, NEWS: Document changes
(--color, --devices, -D)
* src/dfa.c, src/grep.c: Add vim modelines
2001-08-30 Alain Magloire
* configure.in: Add gl in ALL_LINGUAS.
2001-08-30 Kurt D Schwehr
* doc/grep.1: Warn that grep inserts a "--" between groups of matches
when using the context options.
* doc/grep.texi: Likewise.
2001-08-25 Heikki Korpela
* doc/grep.texi: Point out that some platforms do not support
reading of directories and silently ignore them.
2001-08-21 Alain Magloire
* lib/malloc.c: New file:
* lib/realloc.c: New file:
* lib/Makefile.am: Add malloc.c and realloc.c in EXTRA_DIST.
2001-07-31 Alain Magloire
* po/*.po: New files from the translation team:
grep-2.5e.de.po grep-2.5e.el.po grep-2.5e.eo.po grep-2.5e.es.po
grep-2.5e.et.po grep-2.5e.fr.po grep-2.5e.gl.po grep-2.5e.it.po
grep-2.5e.pl.po grep-2.5e.sl.po
2001-07-31 Andreas Schwab
* src/grep.c: Fix all uses of error to pass a proper format
string.
2001-07-29 Alain Magloire
* grep/src/grep.c (usage): Typos corrected.
Patches from Santiago Vila.
2001-07-29 Alain Magloire
David Clissold, wrote:
a small bug in the GNU grep 2.4.2, which may have gone unnoticed
because it only causes a failure if building on a system with large
files enabled (e.g. an "off_t" is a "long long" rather than a "long").
savedir() takes an off_t argument, but in grepdir() the parameter
is cast to an (unsigned). Well, if an off_t is larger than an int,
the value gets truncated. This would not normally have an effect on a
little-endian platform (unless the file is >2GB), but on a big-endian
system it will always fail. The external effect is that
"grep -r foo dir_name" fails with ENOMEM (from malloc() within
savedir()).
* grep/src/grep.c (grepdir): Remove the (unsigned) cast when calling
savedir().
Patch from David Clissold.
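
The truncation described in this entry can be reproduced in isolation. The following is a minimal sketch, not grep's code: `size_through_unsigned_cast` is a hypothetical helper, and `int64_t` and `uint32_t` stand in for a 64-bit `off_t` and a 32-bit `unsigned`, which is an assumption about the platform.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pass a 64-bit size through a 32-bit unsigned
   cast, the way grepdir() narrowed the value before the fix.  */
int64_t
size_through_unsigned_cast (int64_t size)
{
  assert (size >= 0);
  return (int64_t) (uint32_t) size;   /* upper 32 bits are discarded */
}
```

Sizes below 4 GiB survive the cast unchanged, which is why the value itself is only damaged for very large inputs. The entry attributes the unconditional big-endian failure to the argument then being read as a wider `off_t`, which this sketch does not model.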
2001-07-29 Alain Magloire
* grep/doc/grep.texi: In Bugs report use {n,m} for consistency.
* grep/doc/grep.1: Likewise.
Noted by Steven Lucy.
2001-04-27 Isamu Hasegawa
* dfa.c (mblen_buf) : New variable holding the number of remaining
bytes of the corresponding multibyte character in the input string.
(SKIP_REMAIN_MB_IF_INITIAL_STATE) : Use mblen_buf.
(match_anychar) : Use mblen_buf.
(match_mb_charset) : Use mblen_buf.
(transit_state_consume_1char) : Use mblen_buf.
(transit_state) : Use inputwcs to get current (multibyte) character.
(dfaexec) : Add initialization of mblen_buf.
2001-04-27 Isamu Hasegawa
* dfa.c (addtok) : Set appropriate value to multibyte_prop.
(dfastate) : Add the initialization of the variable.
(dfaexec) : Call transit_state if d->fail may transit by
multibyte characters.
(transit_state_singlebyte) : Clean up unnecessary code.
(transit_state_consume_1char) : Likewise.
(transit_state) : Add checking for word and newline.
2001-04-19 Isamu Hasegawa
* search.c (check_multibyte_string) : Check the case when mbclen == 0.
2001-04-11 Isamu Hasegawa
* search.c (check_multibyte_string) : Check the head of multibyte
characters, and optimize a bit.
(EGexecute) : Optimize a bit.
(Fexecute) : Fix the index.
2001-04-02 Alain Magloire
* lib/regex.c: Update from GNU lib C, with the changes
provided by Paul Eggert.
* lib/posix/regex.h: Likewise.
2001-02-17 Paul Eggert
Stop trying to support hosts that have nonstandard declarations for
mbrtowc and/or mbstate_t. It's not worth the portability hassle.
* lib/quotearg.c (mbrtowc, mbsinit): Remove workaround macros
for hosts that have mbrtowc but not mbstate_t, as we now
insist on proper declarations for both before using mbrtowc.
2001-03-18 Alain Magloire
* configure.in: Call AC_MBSTATE_T.
* Makefile.am: Add mbstate_t.m4
* m4/Makefile.am: Add mbstate_t.m4
* m4/mbstate_t.m4: New m4 macro.
* lib/strtol.c: Define CHAR_BITS.
Uwe H. Steinfeld, Ruslan Ermilov and Volkert Bochert noted
that mbstate_t was not defined on certain platforms.
2001-03-18 Paul Eggert
* src/grep.c (fillbuf): Fix storage allocation performance
bug: buffer was doubling in size in many cases where it didn't
have to.
2001-03-17 Paul Eggert
* src/grep.c (fillbuf): Avoid unnecessary division by 2.
Don't check xrealloc return value; it's guaranteed to be nonzero.
(fillbuf, grepdir): Use xalloc_die rather than error; it's shorter.
2001-03-17 Alain Magloire
* src/grep.c (context_length_arg): error () passing wrong format.
Spotted by Jim Meyering.
2001-03-07 Alain Magloire
* README-alpha: Removed reference to GNU tar, add the location
of the CVSROOT.
2001-03-06 Alain Magloire
Only the Regex patterns should be split in an array, patterns[].
The dfa and KWset compiled patterns should remain global and the
patterns compiled all at once.
* src/search.c: include "error.h" and "xalloc.h" to get prototyping
of x*alloc() and error().
(kwsinit): Reverse to previous behaviour and takes no argument.
(kwsmusts): Likewise.
(Gcompile): For the regex pattern, split them and each pattern
is put in different compiled structure patterns[]. The patterns
are given to dfacomp() and kwsmusts() as is.
(Ecompile): Likewise.
(Fcompile): Revert to the old behaviour of compiling the entire
patterns in one shot.
(EGexecute): If falling to GNU regex for the matching, loop in the
array of compile patterns[] to find a match.
(error): Many error () calls passed their arguments in the wrong order.
* tests/file.sh: Simple test to check for pattern in files.
Reaction to bug report fired by Greg Louis <glouis@dynamicro.on.ca>
2001-03-06 Isamu Hasegawa
In multibyte environments, handle multibyte characters as single
characters in bracket expressions.
* src/dfa.h (mb_char_classes) : new structure.
(mbcsets): new variable.
(nmbcsets): new variable.
(mbcsets_alloc) : new variable.
* src/dfa.c (prtok) : handle MBCSET.
(fetch_wc): new function to fetch a wide character.
(parse_bracket_exp_mb) : new function to handle multibyte character
in lex().
(lex): invoke parse_bracket_exp_mb() for multibyte bracket expression.
(atom): handle MBCSET.
(epsclosure): likewise.
(dfaanalyze): likewise.
(dfastate): likewise.
(match_mb_charset): new function to judge whether a bracket match
with a multibyte character.
(check_matching_with_multibyte_ops) : handle MBCSET.
(dfainit): initialize new variables.
(dfafree): free new variables.
2001-03-04 Alain Magloire
To get more in sync with other GNU utilities like GNU tar and fetish
all the supporting functions are now under lib.
Thanks to Jim Meyering, Volkert Bochert and Paul Eggert for
the code and the reminders.
* src/grep.c (fatal): Function removed, using error () from
lib/error.c instead.
(usage): Copyright updated.
(error): Function removed, using error () from lib/error.c instead,
adjust prototypes.
(prog): Global variable rename to program_name, to work with new
lib/error.c.
(xrealloc): Removed using lib/xmalloc.c.
(xmalloc): Removed using lib/xmalloc.c
(main): Register with atexit() to check for error on stdout.
* configure.in: Check for atexit(), call jm_MALLOC, jm_RELLOC and
jm_PREREQ_ERROR.
* tests/bre.awk: Removed the hack to drain the buffer since we
always fclose(stdout) atexit.
* tests/ere.awk: Likewise.
* tests/spencer1.awk: Likewise.
* bootstrap/Makefile.try: Update the Makefile to reflect the changes
in the new hierarchy.
* README-alpha: New File.
* m4/realloc.m4: New File.
* m4/malloc.m4: New File.
* m4/error.m4: New File.
* m4/Makefile.am: Updated.
* lib: New directory.
* lib/Makefile.am: New file.
* lib/closeout.c: New file.
* lib/closeout.h: New file.
* lib/fnmatch.c: New file.
* lib/fnmatch.h: New file.
* lib/atexit.c: New file.
* lib/error.c: New file.
* lib/error.h: New file.
* lib/quotearg.h: New file.
* lib/quotearg.c: New file.
* lib/xmalloc.c: New file.
* lib/posix: New directory.
* lib/posix/Makefile.am: New file.
* src/getopt.c: Moved to lib.
* src/getopt1.c: Moved to lib.
* src/getopt.h: Moved to lib.
* src/alloca.c: Moved to lib.
* src/exclude.c: Moved to lib.
* src/exclude.h: Moved to lib.
* src/hard-locale.h: Moved to lib.
* src/hard-locale.c: Moved to lib.
* src/isdir.c: Moved to lib.
* src/mechr.c: Moved to lib.
* src/obstack.c: Moved to lib.
* src/obstack.h: Moved to lib.
* src/regex.c: Moved to lib.
* src/regex.h: Moved to lib.
* src/posix: Moved to lib.
* src/posix/regex.h: Moved to lib.
* src/savedir.h: Moved to lib.
* src/savedir.c: Moved to lib.
* src/stpcpy.c: Moved to lib.
* src/strtoul.c: Moved to lib.
* src/strtol.c: Moved to lib.
* src/strtoull.c: Moved to lib.
* src/strtoumax.c: Moved to lib.
* src/xstrtol.c: Moved to lib.
* src/xstrtol.h: Moved to lib.
* src/xstrtoumax.c: Moved to lib.
2001-03-01 Isamu Hasegawa
Implement the mechanism to match with multibyte characters,
and use it for `period' in multibyte environments.
* dfa.h (mbps): new variable.
* dfa.c (prtok): handle ANYCHAR.
(lex): use ANYCHAR for `period' in multibyte environments.
(atom): handle ANYCHAR.
(state_index): initialize mbps in multibyte environments.
(epsclosure): handle ANYCHAR.
(dfaanalyze): handle ANYCHAR.
(dfastate): handle ANYCHAR.
(realloc_trans_if_necessary): new function.
(transit_state_singlebyte): new function.
(match_anychar): new function.
(check_matching_with_multibyte_ops): new function.
(transit_state_consume_1char): new function.
(transit_state): new function.
(dfaexec): invoke transit_state if expression can match with
a multibyte character in multibyte environments.
(dfamust): handle ANYCHAR.
2001-03-01 Alain Magloire
* src/exclude.c: New file.
* src/exclude.h: New file.
* src/grep.c (main): Took the GNU tar code to handle
the option --include, --exclude, --exclude-from.
Files are checked for a match with exclude_filename ().
New option --exclude-from.
* src/savedir.c: Call exclude_filename() to check for
file pattern exclusion or inclusion.
* configure.in: --disable-pcre rename to --disable-perl-regexp.
2001-02-25 Alain Magloire
* src/dfa.c: Typo corrected.
Noted by Isamu Hasegawa.
* src/savedir.c: Typos corrected.
2001-02-22 Alain Magloire
* src/savedir.c (isdir1): New function, calling isdir with
the correct pathname.
2001-02-19 Isamu Hasegawa
Avoid incorrect state transition in multibyte environments.
* dfa.h (nmultibyte_prop): new variable.
(multibyte_prop): new variable.
* dfa.c (addtok): set inputwcs.
(dfastate): avoid incorrect state transition in multibyte
environments.
(dfaexec): likewise.
(dfainit): init multibyte_prop.
(dfafree): free multibyte_prop.
(inputwcs): new variable.
2001-02-19 Isamu Hasegawa
Handle a multibyte character followed by '*', '+', and '{n,m}'
correctly.
* dfa.c (update_mb_len_index): new function.
Support for multibyte string.
(FETCH): call update_mb_len_index.
(lex): check cur_mb_index not to misunderstand multibyte characters.
(atom): make a tree from a multibyte character.
(dfaparse): initialize new variables.
(mbs): new variable.
(cur_mb_len): new variable.
(cur_mb_index): new variable.
2001-02-18 Jim Meyering
* m4/dosfile.m4 (AC_DOSFILE): Move AC_DEFINEs out of AC_CACHE_CHECK.
2001-02-17 Alain Magloire
* doc/grep.texi: Document the new options and the new behaviour:
back-references are local. Use an excerpt from Karl Berry's regex
texinfo.
* bootstrap/Makefile.try: Added xstrtoumax.o xstrtoul.o hard-local.o
2001-02-17 Alain Magloire
From Guglielmo 'bond' Bondioni :
The bug was that using a multi line file that contained REs (one per
line), backreferences in the REs were considered global (to the file)
and not local (to the line).
That is, \1 in line n refers to the first \(.\) in the whole file,
rather than in the line itself.
From Tapani Tarvainen :
# Re: grep -e '\(a\)\1' -e '\(b\)\1'
That's not the way it should work: multiple -e arguments
should be treated as independent patterns and back references
should not refer to previous ones.
From Paul Eggert :
GNU grep currently does not issue
diagnostics for the following two cases, both of which are erroneous:
grep -e '[' -e ']'
grep '[
]'
POSIX requires a diagnostic in both cases because '[' is not a valid
regular expression.
To overcome those problems, grep no longer passes the concatenated
patterns to GNU regex but rather compiles each pattern separately
and keeps the results in an array.
* src/search.c (patterns): New global variable; a structure array
holding the compiled patterns.
Declare function prototypes to minimize error.
(dfa, kswset, regexbuf, regs): Removed, no longer static globals, but
rather fields in patterns[] structure per motif.
(Fcompile): Alloc an entry in patterns[] to hold the regex.
(Ecompile): Alloc an entry per motif in the patterns[] array.
(Gcompile): Likewise.
(EGexecute): Loop through the array of patterns[] for a match.
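
The idea of compiling each pattern on its own can be illustrated with the POSIX regex API. This is a minimal sketch under stated assumptions, not the code in src/search.c: `match_any_pattern` is a hypothetical name, patterns are treated as basic regular expressions, and nothing is cached between calls.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches any of the N basic regular expressions in
   PATTERNS, 0 if none match, and -1 if a pattern fails to compile.
   Each pattern gets its own regex_t, so \1 can only refer to a group
   in the same pattern.  */
int
match_any_pattern (const char *const *patterns, size_t n, const char *line)
{
  int matched = 0;

  assert (patterns != NULL && line != NULL);
  for (size_t i = 0; i < n; i++)
    {
      regex_t re;
      if (regcomp (&re, patterns[i], 0) != 0)
        return -1;              /* invalid pattern, for example "[" */
      if (regexec (&re, line, 0, NULL, 0) == 0)
        matched = 1;
      regfree (&re);
    }
  return matched;
}
```

With the patterns `\(a\)\1` and `\(b\)\1`, a line containing `bb` matches and the line `ab` does not, because neither back-reference can see the other pattern's group. A lone `[` is reported as a compile error instead of being silently joined with the next pattern.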
2001-02-17 Alain Magloire
From Bernd Strieder :
# tail -f logfile | grep important | do_something_urgent
# tail -f logfile | grep important | do_something_taking_very_long
If grep does full buffering in these cases then the urgent operation
does not happen as it should in the first case, and in the second case
time is lost due to waiting for the buffer to be filled.
Strictly speaking, this is not grep's fault in the first place, but libc's.
There is a heuristic in libc that makes a stream line-buffered only if a
terminal is on the other end. This doesn't take care of the cases where
this connection is somehow indirect.
* src/grep.c (line_buffered): new option variable.
(prline): if line_buffered is set, fflush() is called.
(usage): line_buffered new option.
Input from Paul Eggert, doing setvbuf() may not be portable
and breaks grep -z.
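
The approach in this entry, flushing after every output line instead of changing the stream's buffering mode with setvbuf(), can be sketched as follows. `print_line` is a hypothetical stand-in for illustration; grep's own prline() does considerably more.

```c
#include <assert.h>
#include <stdio.h>

/* Write one output line.  When LINE_BUFFERED is nonzero, flush right
   away so that a reader on the other end of a pipe sees the line
   immediately.  Return 0 on success, -1 on a write or flush error.  */
int
print_line (FILE *out, const char *line, size_t len, int line_buffered)
{
  assert (out != NULL);
  if (fwrite (line, 1, len, out) != len)
    return -1;
  if (line_buffered && fflush (out) != 0)
    return -1;
  return 0;
}
```

Because the flush is tied to each line written rather than to the stream's mode, the same code path works whatever byte terminates a line.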
2001-02-16 Alain Magloire
Patch from Isamu Hasegawa, for multibyte support.
This patch protects kwset_matcher from the following problems.
For example, in SJIS encoding, one character has the codepoint 0x895c.
So the second byte of the character can match with '\' incorrectly.
And in eucJP encoding, there are the characters whose codepoints are
0xa5b9, 0xa5c8. On the other hand, there is one character whose
codepoint is 0xb9a5. So 0xb9a5 can match with 2nd byte of 0xa5b9
and 1st byte of 0xa5c8.
* configure.in: Add check for mbrtowc.
* src/search.c (check_multibyte_string): new function.
Support for multibyte string.
(EGexecute): call check_multibyte_string when kwset is set.
(Fexecute): call to check_multibyte_string.
(MBS_SUPPORT): new macro.
(MB_CUR_MAX): new macro.
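
The false match on the second byte of 0x895c can be avoided by checking that a candidate match starts on a character boundary. The sketch below is not check_multibyte_string(): it hard-codes the Shift_JIS lead-byte ranges (0x81-0x9F and 0xE0-0xFC, an assumption made for this example) instead of asking the locale through mbrtowc, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for this sketch: Shift_JIS lead bytes are 0x81-0x9F and
   0xE0-0xFC; every other byte is a complete character.  */
static int
is_sjis_lead_byte (unsigned char c)
{
  return (0x81 <= c && c <= 0x9F) || (0xE0 <= c && c <= 0xFC);
}

/* Return 1 if OFFSET is the start of a character in BUF[0..LEN),
   0 if it falls inside a two-byte character.  */
int
is_char_boundary (const unsigned char *buf, size_t len, size_t offset)
{
  size_t i = 0;

  assert (offset <= len);
  while (i < offset)
    i += (is_sjis_lead_byte (buf[i]) && i + 1 < len) ? 2 : 1;
  return i == offset;
}
```

For the bytes 0x89 0x5c followed by `a`, offset 1 is rejected, so a byte-level hit on `\` there would be discarded, while offsets 0 and 2 are accepted.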
2001-02-16 Alain Magloire
* djgpp/config.bat: Fix for 4dos.com.
* m4/dosfile.m4 (HAVE_DOS_FILE_CONTENTS): Was not set.
Bugs noted and patched by Juan Manuel Guerrero.
2001-02-16 Alain Magloire
A much requested feature, the possibility to select
files when doing recurse :
# find . -name "*.c" | xargs grep main {}
# grep --include=*.c main .
# find . -not -name "*.c" | xargs grep main {}
# grep --exclude=*.c main .
* src/grep.c (short_options): -R equivalent to -r.
(#ifdef) : Fix some inconsistencies in the use of #ifdefs, prefer
#if defined() when possible.
(long_options): Add --color, --include and exclude.
(Usage): Description of new options.
(color): Rename color variable to color_option.
Removed 'always|never|auto' arguments, not necessary for grep.
(exclude_pattern): new variable, holder for the file pattern.
(include_pattern): new variable, holder for the file pattern.
* src/savedir.c: Signature change, take two new arguments.
* doc/grep.texi: Document, new options.
* doc/grep.man: Document, new options.
2001-02-09 Alain Magloire
* src/grep.c (long_options): Added equivalent to -r with -R.
* src/grep.c (usage): added --color and --colour.
Noted with patch from, H.Merijn Brand and Wichert Akkerman.
2001-02-09 Alain Magloire
Patch from Ulrich Drepper to provide highlighting.
* src/grep.c: New option --color.
(color): New static var.
(COLOR_OPTION): new constant.
(grep_color): new static var.
(prline): Now when color is set prline() will call the current matcher
to find the offset of the matching string.
* src/savedir.c: Take advantage of _DIRENT_HAVE_TYPE if supported.
* src/search.c (EGexecute, Fexecute, Pexecute): Take a new argument
when doing exact match for the color highlighting.
2000-09-01 Brian Youmans
* doc/grep.texi: Typo fixes.
2000-08-30 Paul Eggert
* doc/grep.texi (Usage): Talk about what "grep -r hello *.c"
means.
2000-08-20 Paul Eggert
Handle range expressions correctly even when they match
strings with two or more characters.
* src/dfa.h (CRANGE): New enum value. Comment fix.
* src/dfa.c: Include <locale.h> if HAVE_SETLOCALE.
Include "hard-locale.h".
(prtok): Print CRANGE.
(hard_LC_COLLATE): New static var.
(lex): Return CRANGE when parsing a character range in a hard locale.
Don't use strcoll; it's no longer needed and wasn't correct anyway.
Use unsigned rather than token to hold unsigned chars.
(addtok): Comment fix.
(atom): Treat a CRANGE as if it were (.\1), approximately.
(dfaparse): Initialize hard_LC_COLLATE.
* src/Makefile.am (base_sources): Add hard-locale.c, hard-locale.h.
* src/hard-locale.c, src/hard-locale.h: New files, taken from
textutils.
2000-08-20 Paul Eggert
* tests/Makefile.am (TESTS_ENVIRONMENT): Add LC_ALL=C, since
some of the tests assume the C locale.
2000-08-16 Paul Eggert
* src/search.c (Gcompile, Ecompile): -x overrides -w, for
consistency with fgrep. Don't assume that sizes fit in 'int'.
Fix comments to match code.
2000-06-06 Paul Eggert
* src/grep.c (grepdir): Don't look at st_dev when testing for
Mingw32 bug.
2000-06-05 Paul Eggert
Port to Mingw32, based on suggestions from Christian Groessler
<cpg@aladdin.de>.
* src/isdir.c: New file, taken from fileutils.
* src/Makefile.am (base_sources): Add isdir.c.
* src/grep.c (grepfile): Use isdir instead of doing it inline.
(grepdir): Suppress ancestor check if the directory's inode and device
are both zero, as that occurs only on Mingw32 which doesn't support
inode or device.
* src/system.h (isdir): New decl.
(is_EISDIR): Depend on HAVE_DIR_EACCES_BUG, not D_OK.
Use isdir, not access.
2000-06-02 Paul Eggert
Problem noted by Gerald Stoller <gerald_stoller@hotmail.com>
* src/grep.c (main): POSIX.2 says that -q overrides -l, which
in turn overrides the other output options. Fix grep to
behave that way.
2000-05-27 Paul Eggert
Simplify and tune the buffer allocation strategy. Do not reserve a
large save area: reserve only enough bytes to hold the residue, plus
page alignment. Put a newline sentinel before the buffer, for speed
when searching backwards for newline.
* src/grep.c (ubuffer, bufsalloc, PREFERRED_SAVE_FACTOR, page_alloc):
Remove. All uses changed.
(INITIAL_BUFSIZE): New macro.
(reset, fillbuf): Use simpler buffer allocation strategy.
(reset): Check for preposterously large pagesize that would cause
later calculations to overflow.
(fillbuf): Do not resize buffer if there's room at the end for
at least one more page. This greatly increases performance when
reading from non-regular files that contain no newlines.
When growing the buffer, double its size instead of using a
more complicated algorithm.
(prtext, grep): Speed up by relying on the newline sentinel before the
start of the buffer.
(grep): When looking backwards for the last newline in a buffer,
stop when we hit the residue, since it can't contain a newline.
This avoids an O(N**2) algorithm when reading binary data from
a pipe. Use a sentinel to speed up the backward search for newline.
(nlscan): Undo previous change; it wasn't needed and just complicates
and slows down the code a tad.
2000-05-24 Paul Eggert
Handle very large input counts better. Bug noted by Jim Meyering.
* src/grep.c (totalcc, totalnl): Use uintmax_t, not off_t.
(add_count): New function.
(nlscan, prline, grep): Use it to check line and byte count overflows.
(nlscan, grep): Don't keep track of counts when not asked to; this
avoids unnecessary overflow diagnostics.
(print_offset_sep): Now takes args of type uintmax_t and char,
not off_t and int.
2000-05-16 Paul Eggert
Problem reported by Bob Proulx <rwp@hprwp.fc.hp.com>; this patch
is based on his findings, with appropriate corrections.
* src/grep.c (main): Fix bug: -x and -w matched even when no
patterns were specified.
* tests/empty.sh: Test for -x and -w bug in grep 2.4.2.
2000-04-24 Paul Eggert
POSIX.2 conformance fixes: grep -q now exits with status zero
if an input line is selected, even if an error also occurs.
grep -s no longer affects exit status.
* src/grep.c (suppress_errors): Move definition earlier so
that suppressible_error can use it.
(suppressible_error): New function.
(exit_on_match): New var.
(grepbuf): If exit_on_match is nonzero, exit with status zero
immediately.
(grep, grepfile, grepdir): Invoke suppressible_error.
(main): -q sets exit_on_match.
* doc/grep.1, doc/grep.texi, NEWS:
Document -q's behavior as required by POSIX.2.
* tests/status.sh:
Test for -q and -s behavior as conforming to POSIX.2.
2000-04-20 Paul Eggert
* tests/Makefile.am (TESTS_ENVIRONMENT):
Set GREP_OPTIONS to the empty string.
2000-04-20 Paul Eggert
* tests/status.sh: Fix typo: test -b -> test -r.
2000-04-20 Paul Eggert
* src/dfa.c (lex):
Do not assume that [c] is equivalent to [c-c]; this isn't true
if LC_COLLATE specifies that some characters are equivalent.
(setbit_case_fold): New function.
(lex): Use it to simplify the code a bit.
2000-04-17 Paul Eggert
Do CRLF munging only if HAVE_DOS_FILE_CONTENTS, instead of
having it depend on O_BINARY (which leads to incorrect results
on BeOS, VMS, and MacOS).
* bootstrap/Makefile.try (DEFS): Add -DHAVE_DOS_FILE_CONTENTS.
* src/system.h (SET_BINARY): Define only if HAVE_DOS_FILE_CONTENTS.
(O_BINARY): Do not define.
* m4/dosfile.m4: Define HAVE_DOS_FILE_CONTENTS if it appears we're
using DOS.
* src/grep.c (undossify_input, fillbuf, dosbuf.c, prline, main):
Depend on HAVE_DOS_FILE_CONTENTS, not O_BINARY, when handling CRLF
matters.
(grepfile, main): Depend on SET_BINARY, not O_BINARY, when
handling binary files on hosts that care about text versus binary.
2000-04-17 Paul Eggert
* lib/getpagesize.h (getpagesize): Define to B_PAGE_SIZE if
__BEOS__ is defined. Based on a fix by Bruno Haible
<haible@clisp.cons.org>.
2000-04-17 Bruno Haible
* src/system.h [BeOS]: Ignore O_BINARY.
* src/getpagesize.h [BeOS]: Define getpagesize() as B_PAGE_SIZE.
2000-04-10 Paul Eggert
* doc/grep.1, doc/grep.texi, NEWS: -C now requires an operand.
* src/grep.c (short_options, long_options, main, usage): Likewise.
(context_length_arg): Renamed from ck_atoi. Now reports an error
and exits if the number is out of range for a context length.
(get_nondigit_option): New function, which checks for overflow
correctly, and which does not parse nonadjacent strings of digits
into a single number.
(main): Use get_nondigit_option instead of doing the code inline.
With -A, -B, and -C, optarg is now guaranteed to be nonzero.
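
The range check that context_length_arg performs can be approximated with strtol. This is a hedged sketch only: `parse_context_length` is a hypothetical name, it returns a status instead of reporting an error and exiting as the entry describes, and grep itself may use a different conversion routine.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse STR as a nonnegative context length.  Return 1 and store the
   value in *OUT on success; return 0 if STR is not a plain decimal
   number or does not fit in an int.  */
int
parse_context_length (const char *str, int *out)
{
  char *end;
  long value;

  assert (str != NULL && out != NULL);
  if (*str < '0' || '9' < *str)
    return 0;                   /* reject signs, spaces, empty string */
  errno = 0;
  value = strtol (str, &end, 10);
  if (errno == ERANGE || *end != '\0' || value > INT_MAX)
    return 0;
  *out = (int) value;
  return 1;
}
```

Rejecting a non-digit first character up front matters because strtol would otherwise accept leading whitespace and a sign.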
2000-04-08 Paul Eggert
Now that we know that the input is always terminated by a
newline before the matching algorithms see it, clean up the
matching algorithms so that they no longer need to modify the
input by inserting a sentinel newline, and no longer worry
about running off the end of the buffer due to a missing sentinel.
* src/grep.c (nlscan, prpending, prtext, grepbuf): Do not
worry about running off the end of the input buffer, since
it's now guaranteed to end in the sentinel newline.
* src/search.c (EGexecute, Pexecute): Likewise.
* src/dfa.c (prtok, dfasyntax, dfaparse, copy, merge, state_index,
epsclosure, dfaexec, dfacomp):
Change many instances of "T *" to "T const *", to catch
any inadvertent programming errors made during this conversion.
* src/dfa.h (dfacomp, dfaexec, dfaparse): Likewise.
* src/grep.c (struct stats.parent, long_options, grepdir,
compile, execute, fillbuf, lastnl, lastout, nlscan, prline,
prpending, prtext, grepbuf, grep, grepfile, grepdir): Likewise.
* src/grep.h (struct matcher.compile, struct matcher.execute):
Likewise.
* src/kwset.c (struct kwset.trans, kwsalloc, kwsincr, treefails,
treedelta, hasevery, treenext, bmexec, cwexec, kwsexec): Likewise.
* src/kwset.h (kwsalloc, kwsincr, kwsexec): Likewise.
* src/search.c (kwsmusts, Gcompile, Ecompile, EGexecute, Pcompile,
Pexecute): Likewise.
* src/dfa.c (dfaexec):
Use size_t, not char *, to avoid worrisome casts to convert
char const * to char *.
* src/dfa.h (dfaexec): Likewise.
* src/grep.c (execute): Likewise.
* src/grep.h (execute): Likewise.
* src/kwset.c (bmexec, cwexec, kwsexec): Likewise.
* src/kwset.h (struct kwsmatch.offset, kwsalloc, kwsincr,
kwsexec): Likewise.
* src/search.c (EGexecute, Fexecute, Pexecute): Likewise.
* src/dfa.h (_PTR_T): Depend on defined __STDC__, not __STDC__.
(PARAMS): Depend on PROTOTYPES, not __STDC__.
* src/dfa.c (dfasyntax): Last arg is unsigned char, not int.
* src/dfa.h (dfasyntax): Likewise.
* src/dfa.h (struct dfa): Remove member newlines; no longer needed.
* src/dfa.c (build_state, dfaexec, dfafree): Do not worry
about special newline state.
* src/search.c (matchers): Move definition to end of file, so
that we don't need forward decls.
(lastexact): Remove.
(kwset_exact_matches): New var; subsumes old lastexact var.
All uses changed.
* src/dfa.c (index): Remove macro.
(REALLOC_IF_NECESSARY): Skip unnecessary test.
(tstbit, setbit, clrbit): Declare arg to be unsigned, to help compiler.
(copyset, zeroset, equal): Use C builtin primitives, to help compiler.
(dfaexec): Do not modify input string.
Remove newline parameter; no longer needed.
(comsubs): Use strchr, not index.
* src/grep.h (matchers): Use fixed name size, not pointer (as
there's no need for the extra flexibility). All uses changed.
* src/kwset.h (struct kwsmatch.offset): Renamed from beg, with
change of type to size_t. All uses changed.
* src/grep.c (reset): No longer need kludge for dfaexec. Simplify.
(reset, grepbuf): Adjust to new interface for 'execute'.
(install_matcher): List is now terminated by null compile,
not null name.
Do not invoke setrlimit if that wouldn't change the limit.
* src/dfa.c (xcalloc, xmalloc, xrealloc, prtok, tstbit, setbit,
clrbit, copyset, zeroset, notset, equal, charclass_index,
looking_at, lex, addtok, atom, nsubtoks, copytoks, closure,
branch, regexp, copy, insert, merge, delete, state_index,
build_state, build_state_zero, icatalloc, icpyalloc, istrstr,
ifree, freelist, enlist, comsubs, addlists, inboth):
Remove forward decls; no longer needed.
* src/grep.c (ck_atoi, usage, error, setmatcher,
install_matcher, prepend_args, prepend_default_options,
page_alloc, reset, fillbuf, grepbuf, prtext, prpending, prline,
print_offset_sep, nlscan, grep, grepfile): Likewise.
* src/kwset.c (enqueue, treefails, treedelta, hasevery,
treenext, bmexec, cwexec): Likewise.
* src/search.c (Gcompile, Ecompile, EGexecute, Fcompile, Fexecute,
Pcompile, Pexecute, kwsinit): Likewise.
* src/search.c (Pcompile): Do not assume newly allocated
storage is zeroed.
2000-04-06 Paul Eggert
* doc/grep.1, doc/grep.texi, NEWS: Improve the explanation of
locale-dependent behavior of range expressions. Mention
LC_COLLATE, since this affects range expressions.
2000-03-26 Paul Eggert
* Makefile.am (ACINCLUDE_INPUTS): Add decl.m4, inttypes_h.m4,
uintmax_t.m4, ulonglong.m4, xstrtoumax.m4.
* m4/Makefile.am (EXTRA_DIST): Likewise.
* src/Makefile.am (base_sources):
Add xstrtol.c, xstrtol.h, xstrtoumax.c.
(EXTRA_DIST): Add strtol.c.
* configure.in (jm_AC_TYPE_UINTMAX_T, jm_AC_PREREQ_XSTRTOUMAX,
HAVE_DECL_STRTOUL, HAVE_DECL_STRTOULL): Add.
(AC_REPLACE_FUNCS): Add strtoul.
* src/grep.c: Include xstrtol.h.
(ck_atoi): Use xstrtoumax and do proper overflow checking.
(max_count, outleft): Now off_t, not int.
(main): Likewise. Use xstrtoumax to convert max_count from string.
* acconfig.h (HAVE_DECL_STRTOUL, HAVE_DECL_STRTOULL): New #undefs.
(HAVE_STPCPY, ENABLE_NLS, HAVE_CATGETS, HAVE_GETTEXT,
HAVE_LC_MESSAGES): Remove.
* m4/decl.m4, m4/inttypes_h.m4, m4/uintmax_t.m4, m4/ulonglong.m4,
m4/xstrtoumax.m4, src/strtol.c, src/strtoul.c, src/strtoull.c,
src/strtoumax.c, src/xstrtol.c, src/xstrtol.h, src/xstrtoumax.c:
New files, taken unchanged from textutils, fileutils, sh-utils
and/or tar.
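
The overflow checking mentioned for ck_atoi in the entry above can be illustrated with a minimal sketch. It assumes the C99 function strtoumax in place of the xstrtoumax wrapper, and parse_count is a hypothetical name, not the function in src/grep.c:

```c
#include <errno.h>
#include <inttypes.h>

/* Hypothetical helper, not the ck_atoi from src/grep.c: parse STR as a
   non-negative decimal integer.  Return 0 and store the value in *OUT on
   success; return -1 on empty input, a sign, leading whitespace,
   trailing garbage, or overflow.  */
int
parse_count (const char *str, uintmax_t *out)
{
  char *end;
  uintmax_t value;

  /* strtoumax itself accepts leading whitespace and signs, so reject
     anything that does not start with a digit up front.  */
  if (*str < '0' || *str > '9')
    return -1;

  errno = 0;
  value = strtoumax (str, &end, 10);

  /* ERANGE signals overflow; a nonempty tail means trailing garbage.  */
  if (errno == ERANGE || *end != '\0')
    return -1;

  *out = value;
  return 0;
}
```

A caller would treat a -1 result as an invalid option argument rather than silently wrapping, which is the failure mode a plain atoi-style conversion cannot detect.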
2000-03-23 Paul Eggert
* src/search.c (Pcompile): Add support for NUL bytes in
Perl regular expressions.
2000-03-23 Paul Eggert
* NEWS, doc/grep.1, doc/grep.texi: Change --pcre to --perl-regexp.
* src/grep.c (long_options, usage): Likewise.
* doc/grep.1, doc/grep.texi: Remove pgrep program.
* src/Makefile.am (bin_PROGRAMS): Likewise.
(pgrep_SOURCES): Remove.
* src/grep.c (main): Rename matcher from "pgrep" to "perl".
* src/search.c (matchers): Likewise.
* src/search.c: Do not include stdio.h; no longer needed.
(NILP): Remove.
(sub): No longer static.
(n_pcre): Remove.
(cre): No longer an array. Present only if HAVE_LIBPCRE.
(extra): New variable.
(Pcompile): Use fatal to report errors.
This also removes a possible core dump.
Add checks (marked FIXME) for restrictions in pcre.
Use pcre_maketables for proper localized behavior.
(Pcompile, Pexecute): Use GNU coding style.
The argument is a single pattern, not a list of patterns separated
by newlines; this is for consistency with grep and egrep.
Use pcre_study for speed.
(Pexecute): Abort if we lack pcre.
Abort if pcre_exec reports an impossible error.
Use code similar to the rest of search.c
to narrow down to the line we've found.
2000-03-21 Alain Magloire
* configure.in: added AC_CHECK_LIB(pcre, pcre_exec)
* ChangeLog: Typos corrected.
* src/search.c: new MACRO HAVE_LIBPCRE
2000-03-21 H.Merijn Brand
* src/Makefile.am(bin_PROGRAMS): added pgrep and new macro
pgrep_SOURCES.
* src/search.c: new functions Pcompile() and Pexecute()
to support PCRE. Update matcher[] array for pgrep.
* src/grep.c: new short and long option --pcre and -P.
usage() updated.
2000-03-21 Bastiaan Stougie
Improvement of the -m or --max-count option. Now works for NUM > 1 and
prints trailing context for the last matching line.
* src/grep.c
(after_last_match): Is a new off_t variable that replaces inputhwm
to retain the correct input offset even after a call to fillbuf. Note
that after_last_match has a different meaning than inputhwm:
it always points to the offset in the input of the first byte after
the last matching line, and is 0 if no matching line has been found
yet.
(grep): Print trailing context after the NUMth match when the -m NUM
option is used.
(grep): Added comment. Should have been commented already.
(grepbuf): Now updates outleft correctly. This fixes the bug that the
-m NUM option did not stop after NUM lines for NUM greater than 1.
(grepbuf, prtext): Now update after_last_match instead of inputhwm.
(fillbuf): No longer updates inputhwm.
(prpending): When outputting trailing context of the max_count-th
matching line, stop at the first matching line.
(grepfile): Seek to after_last_match or eof, depending on the values
of outleft and bufmapped.
(usage): added the -m or --max-count option to the help message.
* doc/grep.texi, doc/grep.1: Document the change of the -m option.
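
The meaning of after_last_match described in the entry above can be sketched as follows. This is an illustration under stated assumptions, not the code in src/grep.c: scan_max_count is a hypothetical name, the buffer is held entirely in memory, and a fixed-string comparison stands in for the real matcher.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: scan the LEN bytes at BUF line by line, counting
   lines that contain the fixed string NEEDLE, and stop once MAX_COUNT
   lines have matched.  Returns the number of matching lines found.
   *AFTER_LAST_MATCH receives the offset of the first byte after the
   last matching line, or 0 if no line matched.  */
size_t
scan_max_count (const char *buf, size_t len, const char *needle,
                size_t max_count, size_t *after_last_match)
{
  size_t found = 0;
  size_t pos = 0;
  size_t nlen = strlen (needle);

  *after_last_match = 0;
  while (pos < len && found < max_count)
    {
      const char *line = buf + pos;
      const char *nl = memchr (line, '\n', len - pos);
      size_t line_len = nl ? (size_t) (nl - line) : len - pos;
      size_t next = pos + line_len + (nl ? 1 : 0);
      size_t i;

      /* Naive substring search within the current line only.  */
      for (i = 0; i + nlen <= line_len; i++)
        if (memcmp (line + i, needle, nlen) == 0)
          {
            found++;
            *after_last_match = next;
            break;
          }
      pos = next;
    }
  return found;
}
```

Because the offset is recorded per matching line rather than per buffer refill, a caller reading standard input can seek to it after stopping early, which is the behavior the entry attributes to grepfile.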
2000-03-17 Paul Eggert
Add new -m or --max-count option, based on a suggestion by
Bastiaan Stougie.
* doc/grep.texi, doc/grep.1: Document it.
* src/grep.c (short_options, long_options, main): Add it.
(inputhwm): New variable.
(fillbuf, prtext, grepbuf): Set it.
(bufmapped): Now a macro (defined to zero) if HAVE_MMAP is not defined.
(max_count, outleft): New variables.
(prtext, grepbuf, grep): Don't output more than outleft lines.
(grepfile): If grepping standard input, seek to the limit of what
we've read before exiting. This fixes a bug with mmapped input,
and is needed for proper -m support.
(main): Exit immediately if -m 0 is specified.
2000-03-08 Alain Magloire
* configure.in: version 2.4.2
@ -45,7 +1025,7 @@
2000-01-30 Alain Magloire
* doc/grep.1: corrected typo.
Noted by Ruslan Ermilob.
Noted by Ruslan Ermilov.
2000-01-30 Alain Magloire
@ -171,7 +1151,7 @@
2000-01-04 Paul Eggert
Inititial patch from David O'Brien.
Initial patch by Ruslan Ermilov.
Add --binary-files option.
* NEWS, doc/grep.1, doc/grep.texi: Document it.

View File

@ -1,3 +1,73 @@
Version 2.5.1
- This is a bugfix release. No new features.
Version 2.5
- The new option --label lets you specify a different name for input
from stdin. See the man or info pages for details.
- The internal lib/getopt* files are no longer used on systems providing
getopt functionality in their libc (e.g. glibc 2.2.x).
If you need the old getopt files, use --with-included-getopt.
- The new option --only-matching (-o) will print only the part of matching
lines that matches the pattern. This is useful, for example, to extract
IP addresses from log files.
- i18n bug fixed ([A-Z0-9] wouldn't match A in locales other than C on
systems using recent glibc builds).
- GNU grep can now be built with autoconf 2.52.
- The new option --devices controls how grep handles device files. Its usage
is analogous to --directories.
- The new option --line-buffered calls fflush after every line. There is
a noticeable slowdown when forcing line buffering.
- Back references are now local to the regex.
grep -e '\(a\)\1' -e '\(b\)\1'
The backref \1 in the second expression refers to \(b\).
- The new option --include=PATTERN will only search matching files
when recursing in directories
- The new option --exclude=PATTERN will skip matching files when
recursing in directories.
- The new option --color will use the environment variable GREP_COLOR
(default is red) to highlight the matching string.
--color takes an optional argument specifying when to colorize a line:
--color=always, --color=tty, --color=never
- The following changes are for POSIX.2 conformance:
. The -q or --quiet or --silent option now causes grep to exit
with zero status when an input line is selected, even if an error
also occurs.
. The -s or --no-messages option no longer affects the exit status.
. Bracket regular expressions like [a-z] are now locale-dependent.
For example, many locales sort characters in dictionary order,
and in these locales the regular expression [a-d] is not
equivalent to [abcd]; it might be equivalent to [aBbCcDd], for
example. To obtain the traditional interpretation of bracket
expressions, you can use the C locale by setting the LC_ALL
environment variable to the value "C".
- The -C or --context option now requires an argument, partly for
consistency, and partly because POSIX.2 recommends against
optional arguments.
- The new -P or --perl-regexp option tells grep to interpret the pattern as
a Perl regular expression.
- The new option --max-count=num makes grep stop reading a file after num
matching lines.
New option -m; equivalent to --max-count.
- Translations for bg, ca, da, nb and tr have been added.
Version 2.4.2
- Added more check in configure to default the grep-${version}/src/regex.c

View File

@ -1,50 +1,72 @@
Aharon Robbins <arnold@gnu.org>
Akim Demaille <akim@epita.fr>
Alain Magloire <alainm@gnu.org>
Andreas Schwab <schwab@suse.de>
Andreas Ley <andy@rz.uni-karlsruhe.de>
Ben Elliston <bje@cygnus.com>
David J MacKenzie <djm@catapult.va.pubnix.com>
David O'Brien <obrien@freebsd.org>
Eli Zaretskii <eliz@is.elta.co.il>
Florian La Roche <florian@knorke.saar.de>
Franc,ois Pinard <pinard@IRO.UMontreal.CA>
Grant McDorman <grant@isgtec.com>
Harald Hanche-Olsen <hanche@math.ntnu.no>
Jeff Bailey <jbailey@nisa.net>
Jim Hand <jhand@austx.tandem.com>
Jim Meyering <meyering@asic.sc.ti.com>
Jochen Hein <jochen.hein@delphi.central.de>
Joel N. Weber II <devnull@gnu.org>
John Hughes <john@nitelite.calvacom.fr>
Jorge Stolfi <stolfi@dcc.unicamp.br>
Karl Berry <karl@cs.umb.edu>
Karl Heuer <kwzh@gnu.org>
Kaveh R. Ghazi <ghazi@caip.rutgers.edu>
Kazuro Furukawa <furukawa@apricot.kek.jp>
Keith Bostic <bostic@bsdi.com>
Krishna Sethuraman <krishna@sgihub.corp.sgi.com>
Mark Waite <markw@mddmew.fc.hp.com>
Martin P.J. Zinser <zinser@decus.de>
Martin Rex <martin.rex@sap-ag.de>
Michael Aichlmayr <mikla@nx.com>
Miles Bader <miles@ccs.mt.nec.co.jp>
Olaf Kirch <okir@ns.lst.de>
Paul Eggert <eggert@twinsun.com>
Paul Kimoto <kimoto@spacenet.tn.cornell.edu>
Phillip C. Brisco <phillip.craig.brisco@ccmail.census.gov>
Philippe Defert <Philippe.Defert@cern.ch>
Philippe De Muyter <phdm@info.ucl.ac.be>
Roland Roberts <rroberts@muller.com>
Ruslan Ermilov <ru@freebsd.org>
Shannon Hill <hill@synnet.com>
Sotiris Vassilopoulos <Sotiris.Vassilopoulos@betatech.gr>
Stewart Levin <stew@sep.stanford.edu>
Sydoruk Stepan <step@unitex.kiev.ua>
Tom 'moof' Spindler <dogcow@ccs.neu.edu>
Tom Tromey <tromey@creche.cygnus.com>
Ulrich Drepper <drepper@cygnus.com>
UEBAYASHI Masao <masao@nf.enveng.titech.ac.jp>
Volker Borchert <bt@teknon.de>
Wichert Akkerman <wakkerma@wi.leidenuniv.nl>
William Bader <william@nscs.fast.net>
Aharon Robbins <arnold@gnu.org>
Akim Demaille <akim@epita.fr>
Alain Magloire <alainm@gnu.org>
Andreas Schwab <schwab@suse.de>
Andreas Ley <andy@rz.uni-karlsruhe.de>
Bastiaan "Darquan" Stougie <darquan@zonnet.nl>
Ben Elliston <bje@cygnus.com>
Bernd Strieder <strieder@student.uni-kl.de>
Bernhard Rosenkraenzer <bero@redhat.com>
Bob Proulx <rwp@hprwp.fc.hp.com>
Brian Youmans <3diff@gnu.org>
Bruno Haible <haible@ilog.fr>
Christian Groessler <cpg@aladdin.de>
David Clissold <cliss@austin.ibm.com>
David J MacKenzie <djm@catapult.va.pubnix.com>
David O'Brien <obrien@freebsd.org>
Eli Zaretskii <eliz@is.elta.co.il>
Florian La Roche <laroche@redhat.com>
Franc,ois Pinard <pinard@IRO.UMontreal.CA>
Gerald Stoller <gerald_stoller@hotmail.com>
Grant McDorman <grant@isgtec.com>
Greg Louis <glouis@dynamicro.on.ca>
Guglielmo 'bond' Bondioni <g.bondioni@libero.it>
H. Merijn Brand <h.m.brand@hccnet.nl>
Harald Hanche-Olsen <hanche@math.ntnu.no>
Hans-Bernhard Broeker <broeker@physik.rwth-aachen.de>
Heikki Korpela <heko@iki.fi>
Isamu Hasegawa <isamu@yamato.ibm.com>
Jeff Bailey <jbailey@nisa.net>
Jim Hand <jhand@austx.tandem.com>
Jim Meyering <meyering@asic.sc.ti.com>
Jochen Hein <jochen.hein@delphi.central.de>
Joel N. Weber II <devnull@gnu.org>
John Hughes <john@nitelite.calvacom.fr>
Jorge Stolfi <stolfi@dcc.unicamp.br>
Juan Manuel Guerrero <ST001906@HRZ1.HRZ.TU-Darmstadt.De>
Karl Berry <karl@cs.umb.edu>
Karl Heuer <kwzh@gnu.org>
Kaveh R. Ghazi <ghazi@caip.rutgers.edu>
Kazuro Furukawa <furukawa@apricot.kek.jp>
Keith Bostic <bostic@bsdi.com>
Krishna Sethuraman <krishna@sgihub.corp.sgi.com>
Kurt D Schwehr <kdschweh@insci14.ucsd.edu>
Mark Waite <markw@mddmew.fc.hp.com>
Martin P.J. Zinser <zinser@decus.de>
Martin Rex <martin.rex@sap-ag.de>
Michael Aichlmayr <mikla@nx.com>
Miles Bader <miles@ccs.mt.nec.co.jp>
Olaf Kirch <okir@ns.lst.de>
Paul Eggert <eggert@twinsun.com>
Paul Kimoto <kimoto@spacenet.tn.cornell.edu>
Phillip C. Brisco <phillip.craig.brisco@ccmail.census.gov>
Philippe Defert <Philippe.Defert@cern.ch>
Philippe De Muyter <phdm@info.ucl.ac.be>
Philip Hazel <ph10@cus.cam.ac.uk>
Roland Roberts <rroberts@muller.com>
Ruslan Ermilov <ru@freebsd.org>
Santiago Vila <sanvila@unex.es>
Shannon Hill <hill@synnet.com>
Sotiris Vassilopoulos <Sotiris.Vassilopoulos@betatech.gr>
Stewart Levin <stew@sep.stanford.edu>
Sydoruk Stepan <step@unitex.kiev.ua>
Tapani Tarvainen <tt@mit.jyu.fi>
Tom 'moof' Spindler <dogcow@ccs.neu.edu>
Tom Tromey <tromey@creche.cygnus.com>
Ulrich Drepper <drepper@cygnus.com>
UEBAYASHI Masao <masao@nf.enveng.titech.ac.jp>
Uwe H. Steinfeld <usteinfeld@gmx.net>
Volker Borchert <bt@teknon.de>
Wichert Akkerman <wichert@cistron.nl>
William Bader <william@nscs.fast.net>
Wolfgang Schludi <schludi@syscomp.de>

gnu/usr.bin/grep/closeout.c Normal file

@ -0,0 +1,121 @@
/* closeout.c - close standard output
Copyright (C) 1998, 1999, 2000, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#if ENABLE_NLS
# include <libintl.h>
# define _(Text) gettext (Text)
#else
# define _(Text) Text
#endif
#if HAVE_STDLIB_H
# include <stdlib.h>
#endif
#ifndef EXIT_FAILURE
# define EXIT_FAILURE 1
#endif
#include <stdio.h>
#include <errno.h>
#ifndef errno
extern int errno;
#endif
#include "closeout.h"
#include "error.h"
#include "quotearg.h"
#if 0
#include "__fpending.h"
#endif
static int default_exit_status = EXIT_FAILURE;
static const char *file_name;
/* Set the value to be used for the exit status when close_stdout is called.
This is useful when it is not convenient to call close_stdout_status,
e.g., when close_stdout is called via atexit. */
void
close_stdout_set_status (int status)
{
default_exit_status = status;
}
/* Set the file name to be reported in the event an error is detected
by close_stdout_status. */
void
close_stdout_set_file_name (const char *file)
{
file_name = file;
}
/* Close standard output, exiting with status STATUS on failure.
If a program writes *anything* to stdout, that program should `fflush'
stdout and make sure that it succeeds before exiting. Otherwise,
suppose that you go to the extreme of checking the return status
of every function that does an explicit write to stdout. The last
printf can succeed in writing to the internal stream buffer, and yet
the fclose(stdout) could still fail (due e.g., to a disk full error)
when it tries to write out that buffered data. Thus, you would be
left with an incomplete output file and the offending program would
exit successfully.
FIXME: note the fflush suggested above is implicit in the fclose
we actually do below. Consider doing only the fflush and/or using
setvbuf to inhibit buffering.
Besides, it's wasteful to check the return value from every call
that writes to stdout -- just let the internal stream state record
the failure. That's what the ferror test is checking below.
It's important to detect such failures and exit nonzero because many
tools (most notably `make' and other build-management systems) depend
on being able to detect failure in other tools via their exit status. */
void
close_stdout_status (int status)
{
int e = ferror (stdout) ? 0 : -1;
#if 0
if (__fpending (stdout) == 0)
return;
#endif
if (fclose (stdout) != 0)
e = errno;
if (0 <= e)
{
char const *write_error = _("write error");
if (file_name)
error (status, e, "%s: %s", quotearg_colon (file_name), write_error);
else
error (status, e, "%s", write_error);
}
}
/* Close standard output, exiting with status EXIT_FAILURE on failure. */
void
close_stdout (void)
{
close_stdout_status (default_exit_status);
}
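
The pattern explained in the close_stdout_status comment (let the stream record write failures, then check ferror and the result of fclose once at exit) can be sketched on an arbitrary stream. close_stream_checked is a hypothetical helper introduced only for illustration; it reuses the same convention for e as the function above:

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical helper, not part of closeout.c: close FP and report
   whether any output to it failed.  Returns -1 if every write and the
   final flush succeeded, 0 if an earlier write failed but no errno
   value is available, or the errno value set by a failing fclose.  */
int
close_stream_checked (FILE *fp)
{
  /* The stream remembers earlier write failures in its error flag.  */
  int e = ferror (fp) ? 0 : -1;

  /* fclose flushes buffered data, so this is where a full disk
     typically shows up.  */
  if (fclose (fp) != 0)
    e = errno;

  return e;
}
```

Any result of 0 or greater should be treated as a write error and turned into a nonzero exit status, so that callers such as make can detect the failure.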


@ -0,0 +1,17 @@
#ifndef CLOSEOUT_H
# define CLOSEOUT_H 1
# ifndef PARAMS
# if defined PROTOTYPES || (defined __STDC__ && __STDC__)
# define PARAMS(Args) Args
# else
# define PARAMS(Args) ()
# endif
# endif
void close_stdout_set_status PARAMS ((int status));
void close_stdout_set_file_name PARAMS ((const char *file));
void close_stdout PARAMS ((void));
void close_stdout_status PARAMS ((int status));
#endif

File diff suppressed because it is too large

@ -22,18 +22,24 @@
In addition to clobbering modularity, we eat up valuable
name space. */
# undef PARAMS
#if __STDC__
#ifdef __STDC__
# ifndef _PTR_T
# define _PTR_T
typedef void * ptr_t;
# endif
# define PARAMS(x) x
#else
# ifndef _PTR_T
# define _PTR_T
typedef char * ptr_t;
# endif
#endif
#ifdef PARAMS
# undef PARAMS
#endif
#if PROTOTYPES
# define PARAMS(x) x
#else
# define PARAMS(x) ()
#endif
@ -136,6 +142,21 @@ typedef enum
RPAREN, /* RPAREN never appears in the parse tree. */
CRANGE, /* CRANGE never appears in the parse tree.
It stands for a character range that can
match a string of one or more characters.
For example, [a-z] can match "ch" in
a Spanish locale. */
#ifdef MBS_SUPPORT
ANYCHAR, /* ANYCHAR is a terminal symbol that matches
any multibyte (or single-byte) character.
It is used only if MB_CUR_MAX > 1. */
MBCSET, /* MBCSET is similar to CSET, but for
multibyte characters. */
#endif /* MBS_SUPPORT */
CSET /* CSET (and any value greater) is a
terminal symbol that matches any of a
class of characters. */
@ -223,6 +244,12 @@ typedef struct
char backref; /* True if this state matches a \<digit>. */
unsigned char constraint; /* Constraint for this state to accept. */
int first_end; /* Token value of the first END in elems. */
#ifdef MBS_SUPPORT
position_set mbps; /* Positions which can match multibyte
characters, e.g. period.
This is used only if
MB_CUR_MAX > 1. */
#endif
} dfa_state;
/* Element of a list of strings, at least one of which is known to
@ -234,6 +261,26 @@ struct dfamust
struct dfamust *next;
};
#ifdef MBS_SUPPORT
/* A bracket operator.
e.g. [a-c], [[:alpha:]], etc. */
struct mb_char_classes
{
int invert;
wchar_t *chars; /* Normal characters. */
int nchars;
wctype_t *ch_classes; /* Character classes. */
int nch_classes;
wchar_t *range_sts; /* Range characters (start of the range). */
wchar_t *range_ends; /* Range characters (end of the range). */
int nranges;
char **equivs; /* Equivalent classes. */
int nequivs;
char **coll_elems;
int ncoll_elems; /* Collating elements. */
};
#endif
/* A compiled regular expression. */
struct dfa
{
@ -252,6 +299,32 @@ struct dfa
int nleaves; /* Number of leaves on the parse tree. */
int nregexps; /* Count of parallel regexps being built
with dfaparse(). */
#ifdef MBS_SUPPORT
/* These fields are used only if MB_CUR_MAX > 1, i.e. in multibyte environments. */
int nmultibyte_prop;
int *multibyte_prop;
/* The value of multibyte_prop[i] is defined by following rule.
if tokens[i] < NOTCHAR
bit 1 : tokens[i] is a singlebyte character, or the last-byte of
a multibyte character.
bit 0 : tokens[i] is a singlebyte character, or the 1st-byte of
a multibyte character.
if tokens[i] = MBCSET
("the index of the mbcset corresponding to this operator" << 2) + 3
e.g.
tokens
= 'single_byte_a', 'multi_byte_A', 'single_byte_b'
= 'sb_a', 'mb_A(1st byte)', 'mb_A(2nd byte)', 'mb_A(3rd byte)', 'sb_b'
multibyte_prop
= 3 , 1 , 0 , 2 , 3
*/
/* Array of the bracket expressions in the DFA. */
struct mb_char_classes *mbcsets;
int nmbcsets;
int mbcsets_alloc;
#endif
/* Stuff owned by the state builder. */
dfa_state *states; /* States of the dfa. */
@ -290,13 +363,6 @@ struct dfa
on a state that potentially could do so. */
int *success; /* Table of acceptance conditions used in
dfaexec and computed in build_state. */
int *newlines; /* Transitions on newlines. The entry for a
newline in any transition table is always
-1 so we can count lines without wasting
too many cycles. The transition for a
newline is stored separately and handled
as a special case. Newline is also used
as a sentinel at the end of the buffer. */
struct dfamust *musts; /* List of strings, at least one of which
is known to appear in any r.e. matching
the dfa. */
@ -323,26 +389,21 @@ struct dfa
/* dfasyntax() takes three arguments; the first sets the syntax bits described
earlier in this file, the second sets the case-folding flag, and the
third specifies the line terminator. */
extern void dfasyntax PARAMS ((reg_syntax_t, int, int));
extern void dfasyntax PARAMS ((reg_syntax_t, int, unsigned char));
/* Compile the given string of the given length into the given struct dfa.
Final argument is a flag specifying whether to build a searching or an
exact matcher. */
extern void dfacomp PARAMS ((char *, size_t, struct dfa *, int));
extern void dfacomp PARAMS ((char const *, size_t, struct dfa *, int));
/* Execute the given struct dfa on the buffer of characters. The
first char * points to the beginning, and the second points to the
first character after the end of the buffer, which must be a writable
place so a sentinel end-of-buffer marker can be stored there. The
second-to-last argument is a flag telling whether to allow newlines to
be part of a string matching the regexp. The next-to-last argument,
if non-NULL, points to a place to increment every time we see a
newline. The final argument, if non-NULL, points to a flag that will
last byte of the buffer must equal the end-of-line byte.
The final argument points to a flag that will
be set if further examination by a backtracking matcher is needed in
order to verify backreferencing; otherwise the flag will be cleared.
Returns NULL if no match is found, or a pointer to the first
Returns (size_t) -1 if no match is found, or the offset of the first
character after the first & shortest matching string in the buffer. */
extern char *dfaexec PARAMS ((struct dfa *, char *, char *, int, int *, int *));
extern size_t dfaexec PARAMS ((struct dfa *, char const *, size_t, int *));
/* Free the storage held by the components of a struct dfa. */
extern void dfafree PARAMS ((struct dfa *));
@ -353,7 +414,7 @@ extern void dfafree PARAMS ((struct dfa *));
extern void dfainit PARAMS ((struct dfa *));
/* Incrementally parse a string of given length into a struct dfa. */
extern void dfaparse PARAMS ((char *, size_t, struct dfa *));
extern void dfaparse PARAMS ((char const *, size_t, struct dfa *));
/* Analyze a parsed regexp; second argument tells whether to build a searching
or an exact matcher. */
@ -367,6 +428,5 @@ extern void dfastate PARAMS ((int, struct dfa *, int []));
/* dfaerror() is called by the regexp routines whenever an error occurs. It
takes a single argument, a NUL-terminated string describing the error.
The default dfaerror() prints the error message to stderr and exits.
The user can provide a different dfafree() if so desired. */
The user must supply a dfaerror. */
extern void dfaerror PARAMS ((const char *));
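
The multibyte_prop encoding documented in struct dfa above (bit 0 set on the first byte of a character, bit 1 set on the last byte) can be checked with a small sketch. fill_multibyte_prop is a hypothetical helper, not a function in dfa.c, and it assumes the byte length of each character is already known instead of being derived from the locale:

```c
#include <stddef.h>

/* Hypothetical helper: CHAR_LENS holds the byte length of each of the
   NCHARS characters.  Write one entry per byte into PROP, with bit 0
   set on the first byte of a character and bit 1 set on its last byte.
   PROP must have room for the sum of CHAR_LENS.  Returns the number of
   entries written.  */
size_t
fill_multibyte_prop (const size_t *char_lens, size_t nchars, int *prop)
{
  size_t n = 0;
  size_t c, b;

  for (c = 0; c < nchars; c++)
    for (b = 0; b < char_lens[c]; b++)
      {
        int p = 0;
        if (b == 0)
          p |= 1;               /* first byte of the character */
        if (b + 1 == char_lens[c])
          p |= 2;               /* last byte of the character */
        prop[n++] = p;
      }
  return n;
}
```

With lengths 1, 3, 1 (a single-byte character, a three-byte character, a single-byte character) this yields 3, 1, 0, 2, 3, the same sequence as the example in the header comment. The MBCSET case, which stores an index shifted left by two, is not covered by this sketch.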

File diff suppressed because it is too large

@ -1,3 +1,4 @@
@set UPDATED 2 February 2000
@set EDITION 2.4.2
@set VERSION 2.4.2
@set UPDATED 23 January 2002
@set UPDATED-MONTH January 2002
@set EDITION 2.5.1
@set VERSION 2.5.1

gnu/usr.bin/grep/error.c Normal file

@ -0,0 +1,276 @@
/* Error handler for noninteractive utilities
Copyright (C) 1990-1998, 2000 Free Software Foundation, Inc.
This file is part of the GNU C Library. Its master source is NOT part of
the C library, however. The master source lives in /gd/gnu/lib.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Library General Public License as
published by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Library General Public License for more details.
You should have received a copy of the GNU Library General Public
License along with the GNU C Library; see the file COPYING.LIB. If not,
write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA. */
/* Written by David MacKenzie <djm@gnu.ai.mit.edu>. */
#ifdef HAVE_CONFIG_H
# include <config.h>
#endif
#include <stdio.h>
#if HAVE_LIBINTL_H
# include <libintl.h>
#endif
#if HAVE_VPRINTF || HAVE_DOPRNT || _LIBC
# if __STDC__
# include <stdarg.h>
# define VA_START(args, lastarg) va_start(args, lastarg)
# else
# include <varargs.h>
# define VA_START(args, lastarg) va_start(args)
# endif
#else
# define va_alist a1, a2, a3, a4, a5, a6, a7, a8
# define va_dcl char *a1, *a2, *a3, *a4, *a5, *a6, *a7, *a8;
#endif
#if STDC_HEADERS || _LIBC
# include <stdlib.h>
# include <string.h>
#else
void exit ();
#endif
#include "error.h"
#ifndef HAVE_DECL_STRERROR_R
"this configure-time declaration test was not run"
#endif
#if !HAVE_DECL_STRERROR_R
char *strerror_r ();
#endif
#ifndef _
# define _(String) String
#endif
/* If NULL, error will flush stdout, then print on stderr the program
name, a colon and a space. Otherwise, error will call this
function without parameters instead. */
void (*error_print_progname) (
#if __STDC__ - 0
void
#endif
);
/* This variable is incremented each time `error' is called. */
unsigned int error_message_count;
#ifdef _LIBC
/* In the GNU C library, there is a predefined variable for this. */
# define program_name program_invocation_name
# include <errno.h>
/* In GNU libc we do not want to use the common name `error' directly.
Instead make it a weak alias. */
# define error __error
# define error_at_line __error_at_line
# ifdef USE_IN_LIBIO
# include <libio/iolibio.h>
# define fflush(s) _IO_fflush (s)
# endif
#else /* not _LIBC */
/* The calling program should define program_name and set it to the
name of the executing program. */
extern char *program_name;
# ifdef HAVE_STRERROR_R
# define __strerror_r strerror_r
# else
# if HAVE_STRERROR
# ifndef strerror /* On some systems, strerror is a macro */
char *strerror ();
# endif
# else
static char *
private_strerror (errnum)
int errnum;
{
extern char *sys_errlist[];
extern int sys_nerr;
if (errnum > 0 && errnum < sys_nerr)
return _(sys_errlist[errnum]);
return _("Unknown system error");
}
# define strerror private_strerror
# endif /* HAVE_STRERROR */
# endif /* HAVE_STRERROR_R */
#endif /* not _LIBC */
/* Print the program name and error message MESSAGE, which is a printf-style
format string with optional args.
If ERRNUM is nonzero, print its corresponding system error message.
Exit with status STATUS if it is nonzero. */
/* VARARGS */
void
#if defined VA_START && __STDC__
error (int status, int errnum, const char *message, ...)
#else
error (status, errnum, message, va_alist)
int status;
int errnum;
char *message;
va_dcl
#endif
{
#ifdef VA_START
va_list args;
#endif
if (error_print_progname)
(*error_print_progname) ();
else
{
fflush (stdout);
fprintf (stderr, "%s: ", program_name);
}
#ifdef VA_START
VA_START (args, message);
# if HAVE_VPRINTF || _LIBC
vfprintf (stderr, message, args);
# else
_doprnt (message, args, stderr);
# endif
va_end (args);
#else
fprintf (stderr, message, a1, a2, a3, a4, a5, a6, a7, a8);
#endif
++error_message_count;
if (errnum)
{
#if defined HAVE_STRERROR_R || _LIBC
char errbuf[1024];
# if HAVE_WORKING_STRERROR_R || _LIBC
fprintf (stderr, ": %s", __strerror_r (errnum, errbuf, sizeof errbuf));
# else
/* Don't use __strerror_r's return value because on some systems
(at least DEC UNIX 4.0[A-D]) strerror_r returns `int'. */
__strerror_r (errnum, errbuf, sizeof errbuf);
fprintf (stderr, ": %s", errbuf);
# endif
#else
fprintf (stderr, ": %s", strerror (errnum));
#endif
}
putc ('\n', stderr);
fflush (stderr);
if (status)
exit (status);
}
/* Sometimes we want to have at most one error per line. This
variable controls whether this mode is selected or not. */
int error_one_per_line;
void
#if defined VA_START && __STDC__
error_at_line (int status, int errnum, const char *file_name,
unsigned int line_number, const char *message, ...)
#else
error_at_line (status, errnum, file_name, line_number, message, va_alist)
int status;
int errnum;
const char *file_name;
unsigned int line_number;
char *message;
va_dcl
#endif
{
#ifdef VA_START
va_list args;
#endif
if (error_one_per_line)
{
static const char *old_file_name;
static unsigned int old_line_number;
if (old_line_number == line_number &&
(file_name == old_file_name || !strcmp (old_file_name, file_name)))
/* Simply return and print nothing. */
return;
old_file_name = file_name;
old_line_number = line_number;
}
if (error_print_progname)
(*error_print_progname) ();
else
{
fflush (stdout);
fprintf (stderr, "%s:", program_name);
}
if (file_name != NULL)
fprintf (stderr, "%s:%u: ", file_name, line_number);
#ifdef VA_START
VA_START (args, message);
# if HAVE_VPRINTF || _LIBC
vfprintf (stderr, message, args);
# else
_doprnt (message, args, stderr);
# endif
va_end (args);
#else
fprintf (stderr, message, a1, a2, a3, a4, a5, a6, a7, a8);
#endif
++error_message_count;
if (errnum)
{
#if defined HAVE_STRERROR_R || _LIBC
char errbuf[1024];
# if HAVE_WORKING_STRERROR_R || _LIBC
fprintf (stderr, ": %s", __strerror_r (errnum, errbuf, sizeof errbuf));
# else
/* Don't use __strerror_r's return value because on some systems
(at least DEC UNIX 4.0[A-D]) strerror_r returns `int'. */
__strerror_r (errnum, errbuf, sizeof errbuf);
fprintf (stderr, ": %s", errbuf);
# endif
#else
fprintf (stderr, ": %s", strerror (errnum));
#endif
}
putc ('\n', stderr);
fflush (stderr);
if (status)
exit (status);
}
#ifdef _LIBC
/* Make the weak alias. */
# undef error
# undef error_at_line
weak_alias (__error, error)
weak_alias (__error_at_line, error_at_line)
#endif
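
The message layout that error builds above (program name, a colon and a space, the formatted message, then a colon and the system error text when errnum is nonzero) can be sketched with the standard variadic interface. format_error is a hypothetical helper that writes into a caller-supplied buffer instead of stderr so the result can be inspected; it assumes C99 snprintf and vsnprintf and does not reproduce the pre-ANSI fallbacks in error.c:

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of error.c: build in BUF (SIZE bytes)
   the text an error()-style reporter would print: "PROG: message",
   followed by ": " and strerror (ERRNUM) when ERRNUM is nonzero.
   Output is truncated to fit.  Returns BUF.  */
char *
format_error (char *buf, size_t size, const char *prog, int errnum,
              const char *format, ...)
{
  va_list args;
  size_t used;
  int n;

  if (size == 0)
    return buf;

  /* Program name prefix.  */
  n = snprintf (buf, size, "%s: ", prog);
  used = (n < 0) ? 0 : ((size_t) n < size ? (size_t) n : size - 1);

  /* Caller's printf-style message.  */
  va_start (args, format);
  n = vsnprintf (buf + used, size - used, format, args);
  va_end (args);
  if (n > 0)
    used = ((size_t) n < size - used) ? used + (size_t) n : size - 1;

  /* Optional system error text.  */
  if (errnum)
    snprintf (buf + used, size - used, ": %s", strerror (errnum));

  return buf;
}
```

For example, format_error (buf, sizeof buf, "grep", 0, "bad count %d", 3) produces "grep: bad count 3"; the text appended for a nonzero errnum depends on the platform's strerror.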

gnu/usr.bin/grep/error.h Normal file

@ -0,0 +1,78 @@
/* Declaration for error-reporting function
Copyright (C) 1995, 1996, 1997 Free Software Foundation, Inc.
NOTE: The canonical source of this file is maintained with the GNU C Library.
Bugs can be reported to bug-glibc@prep.ai.mit.edu.
This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2, or (at your option) any
later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
USA. */
#ifndef _ERROR_H
#define _ERROR_H 1
#ifndef __attribute__
/* This feature is available in gcc versions 2.5 and later. */
# if __GNUC__ < 2 || (__GNUC__ == 2 && __GNUC_MINOR__ < 5) || __STRICT_ANSI__
# define __attribute__(Spec) /* empty */
# endif
/* The __-protected variants of `format' and `printf' attributes
are accepted by gcc versions 2.6.4 (effectively 2.7) and later. */
# if __GNUC__ < 2 || (__GNUC__ == 2 && __GNUC_MINOR__ < 7)
# define __format__ format
# define __printf__ printf
# endif
#endif
#ifdef __cplusplus
extern "C" {
#endif
#if defined (__STDC__) && __STDC__
/* Print a message with `fprintf (stderr, FORMAT, ...)';
if ERRNUM is nonzero, follow it with ": " and strerror (ERRNUM).
If STATUS is nonzero, terminate the program with `exit (STATUS)'. */
extern void error (int status, int errnum, const char *format, ...)
__attribute__ ((__format__ (__printf__, 3, 4)));
extern void error_at_line (int status, int errnum, const char *fname,
unsigned int lineno, const char *format, ...)
__attribute__ ((__format__ (__printf__, 5, 6)));
/* If NULL, error will flush stdout, then print on stderr the program
name, a colon and a space. Otherwise, error will call this
function without parameters instead. */
extern void (*error_print_progname) (void);
#else
void error ();
void error_at_line ();
extern void (*error_print_progname) ();
#endif
/* This variable is incremented each time `error' is called. */
extern unsigned int error_message_count;
/* Sometimes we want to have at most one error per line. This
variable controls whether this mode is selected or not. */
extern int error_one_per_line;
#ifdef __cplusplus
}
#endif
#endif /* error.h */
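
Note: the behaviour that the comments in error.h describe for `error` can be sketched in a few lines of portable C. This is an illustration only, not the implementation shipped in this import: `sketch_error`, `sketch_message_count`, and `PRINTF_LIKE` are hypothetical names, the program-name prefix is left out, and the trailing newline is an assumption of the sketch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper around the gcc format attribute, so that the
   compiler can check the arguments against the format string.  */
#if defined __GNUC__
# define PRINTF_LIKE(fmt, first) \
    __attribute__ ((__format__ (__printf__, fmt, first)))
#else
# define PRINTF_LIKE(fmt, first) /* empty */
#endif

/* Hypothetical counterpart of error_message_count.  */
static unsigned int sketch_message_count;

static void sketch_error (int status, int errnum, const char *format, ...)
  PRINTF_LIKE (3, 4);

/* Print the formatted message on stderr; if ERRNUM is nonzero, follow
   it with ": " and strerror (ERRNUM).  If STATUS is nonzero, exit.  */
static void
sketch_error (int status, int errnum, const char *format, ...)
{
  va_list args;

  assert (format != NULL);
  fflush (stdout);
  va_start (args, format);
  vfprintf (stderr, format, args);
  va_end (args);
  if (errnum != 0)
    fprintf (stderr, ": %s", strerror (errnum));
  putc ('\n', stderr);		/* Assumption of this sketch.  */
  sketch_message_count++;
  if (status != 0)
    exit (status);
}
```

A call such as `sketch_error (0, errno, "cannot open %s", name)` reports and continues; a nonzero first argument ends the program with that status.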

128
gnu/usr.bin/grep/exclude.c Normal file

@ -0,0 +1,128 @@
/* exclude.c -- exclude file names
Copyright 1992, 1993, 1994, 1997, 1999, 2000 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; see the file COPYING.
If not, write to the Free Software Foundation,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
/* Written by Paul Eggert <eggert@twinsun.com> */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#include <errno.h>
#ifndef errno
extern int errno;
#endif
#include <exclude.h>
#include <fnmatch.h>
#include <stdio.h>
#include <sys/types.h>
void *xmalloc PARAMS ((size_t));
void *xrealloc PARAMS ((void *, size_t));
/* Keep track of excluded file name patterns. */
struct exclude
{
char const **exclude;
int exclude_alloc;
int exclude_count;
};
struct exclude *
new_exclude (void)
{
struct exclude *ex = (struct exclude *) xmalloc (sizeof (struct exclude));
ex->exclude_count = 0;
ex->exclude_alloc = 64;
ex->exclude = (char const **) xmalloc (ex->exclude_alloc * sizeof (char *));
return ex;
}
int
excluded_filename (struct exclude const *ex, char const *f, int options)
{
char const * const *exclude = ex->exclude;
int exclude_count = ex->exclude_count;
int i;
for (i = 0; i < exclude_count; i++)
if (fnmatch (exclude[i], f, options) == 0)
return 1;
return 0;
}
void
add_exclude (struct exclude *ex, char const *pattern)
{
if (ex->exclude_alloc <= ex->exclude_count)
ex->exclude = (char const **) xrealloc (ex->exclude,
((ex->exclude_alloc *= 2)
* sizeof (char *)));
ex->exclude[ex->exclude_count++] = pattern;
}
int
add_exclude_file (void (*add_func) PARAMS ((struct exclude *, char const *)),
struct exclude *ex, char const *filename, char line_end)
{
int use_stdin = filename[0] == '-' && !filename[1];
FILE *in;
char *buf;
char *p;
char const *pattern;
char const *lim;
size_t buf_alloc = 1024;
size_t buf_count = 0;
int c;
int e = 0;
if (use_stdin)
in = stdin;
else if (! (in = fopen (filename, "r")))
return -1;
buf = xmalloc (buf_alloc);
while ((c = getc (in)) != EOF)
{
buf[buf_count++] = c;
if (buf_count == buf_alloc)
buf = xrealloc (buf, buf_alloc *= 2);
}
buf = xrealloc (buf, buf_count + 1);
if (ferror (in))
e = errno;
if (!use_stdin && fclose (in) != 0)
e = errno;
for (pattern = p = buf, lim = buf + buf_count; p <= lim; p++)
if (p < lim ? *p == line_end : buf < p && p[-1])
{
*p = '\0';
(*add_func) (ex, pattern);
pattern = p + 1;
}
errno = e;
return e ? -1 : 0;
}
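
Note: the final loop of add_exclude_file splits the buffer in place, and its condition is dense enough to merit a standalone sketch. The version below mirrors that loop under stated assumptions: `split_patterns` and its output array are hypothetical, and the caller must supply a buffer with one spare byte after the data, as the `xrealloc (buf, buf_count + 1)` call above arranges.

```c
#include <assert.h>
#include <stddef.h>

/* Split BUF, which holds COUNT bytes of data plus one spare byte at
   BUF[COUNT], into strings at each LINE_END byte.  Store up to
   OUT_MAX start pointers in OUT and return the number of strings
   found.  A last line without a terminator is still reported; a
   trailing terminator does not produce an extra empty string.  */
static size_t
split_patterns (char *buf, size_t count, char line_end,
		char const **out, size_t out_max)
{
  char *p;
  char *lim = buf + count;
  char const *pattern = buf;
  size_t n = 0;

  assert (buf != NULL && out != NULL);
  for (p = buf; p <= lim; p++)
    /* Inside the data, split at LINE_END.  At the very end, emit the
       tail only if the previous byte was not already turned into a
       terminator by an earlier split.  */
    if (p < lim ? *p == line_end : buf < p && p[-1])
      {
	*p = '\0';
	if (n < out_max)
	  out[n] = pattern;
	n++;
	pattern = p + 1;
      }
  return n;
}
```

Both "*.o\n*.a" and "*.o\n*.a\n" yield the same two strings, which is the point of the `p[-1]` test at the end of the buffer.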


@ -0,0 +1,35 @@
/* exclude.h -- declarations for excluding file names
Copyright 1992, 1993, 1994, 1997, 1999 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; see the file COPYING.
If not, write to the Free Software Foundation,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
/* Written by Paul Eggert <eggert@twinsun.com> */
#ifndef PARAMS
# if defined PROTOTYPES || (defined __STDC__ && __STDC__)
# define PARAMS(Args) Args
# else
# define PARAMS(Args) ()
# endif
#endif
struct exclude;
struct exclude *new_exclude PARAMS ((void));
void add_exclude PARAMS ((struct exclude *, char const *));
int add_exclude_file PARAMS ((void (*) (struct exclude *, char const *),
struct exclude *, char const *, char));
int excluded_filename PARAMS ((struct exclude const *, char const *, int));
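
Note: excluded_filename passes its OPTIONS argument straight to fnmatch, so the caller decides how wildcards treat slashes. A minimal sketch of the same matching loop, using a plain array instead of struct exclude (`matches_any` is a hypothetical name):

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if NAME matches any of the COUNT shell patterns in
   PATTERNS, else 0.  OPTIONS is handed to fnmatch unchanged.  */
static int
matches_any (char const *const *patterns, size_t count,
	     char const *name, int options)
{
  size_t i;

  assert (patterns != NULL && name != NULL);
  for (i = 0; i < count; i++)
    if (fnmatch (patterns[i], name, options) == 0)
      return 1;
  return 0;
}
```

With options 0, the pattern `*.o` matches `dir/main.o`, because `*` may match a slash; with FNM_PATHNAME it does not.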


@ -2,6 +2,11 @@
#ifndef HAVE_GETPAGESIZE
#if !defined getpagesize && defined __BEOS__
# include <OS.h>
# define getpagesize() B_PAGE_SIZE
#endif
#ifdef HAVE_UNISTD_H
# include <unistd.h>
#endif


@ -12,7 +12,7 @@
.de Id
.ds Dt \\$4
..
.Id $Id: grep.1,v 1.11 2000/02/26 03:18:40 alainm Exp $
.Id $Id: grep.1,v 1.23 2002/01/22 13:20:04 bero Exp $
.TH GREP 1 \*(Dt "GNU Project"
.SH NAME
grep, egrep, fgrep \- print lines matching a pattern
@ -62,6 +62,9 @@ is the same as
Print
.I NUM
lines of trailing context after matching lines.
Places a line containing
.B \-\^\-
between contiguous groups of matches.
.TP
.BR \-a ", " \-\^\-text
Process a binary file as if it were text; this is equivalent to the
@ -72,11 +75,17 @@ option.
Print
.I NUM
lines of leading context before matching lines.
Places a line containing
.B \-\^\-
between contiguous groups of matches.
.TP
\fB\-C\fP [\fINUM\fP], \fB\-\fP\fINUM\fP, \fB\-\^\-context\fP[\fB=\fP\fINUM\fP]
.BI \-C " NUM" "\fR,\fP \-\^\-context=" NUM
Print
.I NUM
lines (default 2) of output context.
lines of output context.
Places a line containing
.B \-\^\-
between contiguous groups of matches.
.TP
.BR \-b ", " \-\^\-byte-offset
Print the byte offset within the input file before
@ -117,6 +126,11 @@ might output binary garbage,
which can have nasty side effects if the output is a terminal and if the
terminal driver interprets some of it as commands.
.TP
.BI \-\^\-colour[=\fIWHEN\fR] ", " \-\^\-color[=\fIWHEN\fR]
Surround the matching string with the marker found in the
.B GREP_COLOR
environment variable. WHEN may be `never', `always', or `auto'.
.TP
.BR \-c ", " \-\^\-count
Suppress normal output; instead print a count of
matching lines for each input file.
@ -124,6 +138,20 @@ With the
.BR \-v ", " \-\^\-invert-match
option (see below), count non-matching lines.
.TP
.BI \-D " ACTION" "\fR,\fP \-\^\-devices=" ACTION
If an input file is a device, FIFO or socket, use
.I ACTION
to process it. By default,
.I ACTION
is
.BR read ,
which means that devices are read just as if they were ordinary files.
If
.I ACTION
is
.BR skip ,
devices are silently skipped.
.TP
.BI \-d " ACTION" "\fR,\fP \-\^\-directories=" ACTION
If an input file is a directory, use
.I ACTION
@ -163,6 +191,10 @@ Interpret
.I PATTERN
as a list of fixed strings, separated by newlines,
any of which is to be matched.
.BR \-P ", " \-\^\-perl-regexp
Interpret
.I PATTERN
as a Perl regular expression.
.TP
.BI \-f " FILE" "\fR,\fP \-\^\-file=" FILE
Obtain patterns from
@ -208,6 +240,39 @@ the name of each input file from which output
would normally have been printed. The scanning will
stop on the first match.
.TP
.BI \-m " NUM" "\fR,\fP \-\^\-max-count=" NUM
Stop reading a file after
.I NUM
matching lines. If the input is standard input from a regular file,
and
.I NUM
matching lines are output,
.B grep
ensures that the standard input is positioned to just after the last
matching line before exiting, regardless of the presence of trailing
context lines. This enables a calling process to resume a search.
When
.B grep
stops after
.I NUM
matching lines, it outputs any trailing context lines. When the
.B \-c
or
.B \-\^\-count
option is also used,
.B grep
does not output a count greater than
.IR NUM .
When the
.B \-v
or
.B \-\^\-invert-match
option is also used,
.B grep
stops after outputting
.I NUM
non-matching lines.
.TP
.B \-\^\-mmap
If possible, use the
.BR mmap (2)
@ -227,21 +292,43 @@ is operating, or if an I/O error occurs.
Prefix each line of output with the line number
within its input file.
.TP
.BR \-o ", " \-\^\-only-matching
Show only the part of a matching line that matches
.I PATTERN.
.TP
.BI \-\^\-label= LABEL
Displays input actually coming from standard input as input coming from file
.I LABEL.
This is especially useful for tools like zgrep, e.g.
.B "gzip -cd foo.gz |grep --label=foo something"
.TP
.BR \-\^\-line-buffering
Use line buffering; this can incur a performance penalty.
.TP
.BR \-q ", " \-\^\-quiet ", " \-\^\-silent
Quiet; suppress normal output. The scanning will stop
on the first match.
Quiet; do not write anything to standard output.
Exit immediately with zero status if any match is found,
even if an error was detected.
Also see the
.B \-s
or
.B \-\^\-no-messages
option below.
option.
.TP
.BR \-r ", " \-\^\-recursive
.BR \-R ", " \-r ", " \-\^\-recursive
Read all files under each directory, recursively;
this is equivalent to the
.B "\-d recurse"
option.
.TP
.BR "\fR \fP \-\^\-include=" PATTERN
Recurse into directories, searching only files matching
.I PATTERN.
.TP
.BR "\fR \fP \-\^\-exclude=" PATTERN
Recurse into directories, skipping files matching
.I PATTERN.
.TP
.BR \-s ", " \-\^\-no-messages
Suppress error messages about nonexistent or unreadable files.
Portability note: unlike \s-1GNU\s0
@ -358,11 +445,13 @@ a single character. Most characters, including all letters and digits,
are regular expressions that match themselves. Any metacharacter with
special meaning may be quoted by preceding it with a backslash.
.PP
A list of characters enclosed by
A
.I "bracket expression"
is a list of characters enclosed by
.B [
and
.B ]
matches any single
.BR ] .
It matches any single
character in that list; if the first character of the list
is the caret
.B ^
@ -371,10 +460,32 @@ then it matches any character
in the list.
For example, the regular expression
.B [0123456789]
matches any single digit. A range of characters
may be specified by giving the first and last characters, separated
by a hyphen.
Finally, certain named classes of characters are predefined.
matches any single digit.
.PP
Within a bracket expression, a
.I "range expression"
consists of two characters separated by a hyphen.
It matches any single character that sorts between the two characters,
inclusive, using the locale's collating sequence and character set.
For example, in the default C locale,
.B [a\-d]
is equivalent to
.BR [abcd] .
Many locales sort characters in dictionary order, and in these locales
.B [a\-d]
is typically not equivalent to
.BR [abcd] ;
it might be equivalent to
.BR [aBbCcDd] ,
for example.
To obtain the traditional interpretation of bracket expressions,
you can use the C locale by setting the
.B LC_ALL
environment variable to the value
.BR C .
.PP
Finally, certain named classes of characters are predefined within
bracket expressions, as follows.
Their names are self-explanatory, and they are
.BR [:alnum:] ,
.BR [:alpha:] ,
@ -391,8 +502,8 @@ and
For example,
.B [[:alnum:]]
means
.BR [0-9A-Za-z] ,
except the latter form depends upon the \s-1POSIX\s0 locale and the
.BR [0\-9A\-Za\-z] ,
except the latter form depends upon the C locale and the
\s-1ASCII\s0 character encoding, whereas the former is independent
of locale and character set.
(Note that the brackets in these class names are part of the symbolic
@ -539,6 +650,29 @@ instead of reporting a syntax error in the regular expression.
\s-1POSIX.2\s0 allows this behavior as an extension, but portable scripts
should avoid it.
.SH "ENVIRONMENT VARIABLES"
Grep's behavior is affected by the following environment variables.
.PP
A locale
.BI LC_ foo
is specified by examining the three environment variables
.BR LC_ALL ,
.BR LC_\fIfoo\fP ,
.BR LANG ,
in that order.
The first of these variables that is set specifies the locale.
For example, if
.B LC_ALL
is not set, but
.B LC_MESSAGES
is set to
.BR pt_BR ,
then Brazilian Portuguese is used for the
.B LC_MESSAGES
locale.
The C locale is used if none of these environment variables are set,
or if the locale catalog is not installed, or if
.B grep
was not compiled with national language support (\s-1NLS\s0).
.TP
.B GREP_OPTIONS
This variable specifies default options to be placed in front of any
@ -556,28 +690,29 @@ Option specifications are separated by whitespace.
A backslash escapes the next character,
so it can be used to specify an option containing whitespace or a backslash.
.TP
\fBLC_ALL\fP, \fBLC_MESSAGES\fP, \fBLANG\fP
.B GREP_COLOR
Specifies the marker for highlighting.
.TP
\fBLC_ALL\fP, \fBLC_COLLATE\fP, \fBLANG\fP
These variables specify the
.B LC_MESSAGES
locale, which determines the language that
.B grep
uses for messages.
The locale is determined by the first of these variables that is set.
American English is used if none of these environment variables are set,
or if the message catalog is not installed, or if
.B grep
was not compiled with national language support (\s-1NLS\s0).
.B LC_COLLATE
locale, which determines the collating sequence used to interpret
range expressions like
.BR [a\-z] .
.TP
\fBLC_ALL\fP, \fBLC_CTYPE\fP, \fBLANG\fP
These variables specify the
.B LC_CTYPE
locale, which determines the type of characters, e.g., which
characters are whitespace.
The locale is determined by the first of these variables that is set.
The \s-1POSIX\s0 locale is used if none of these environment variables
are set, or if the locale catalog is not installed, or if
.TP
\fBLC_ALL\fP, \fBLC_MESSAGES\fP, \fBLANG\fP
These variables specify the
.B LC_MESSAGES
locale, which determines the language that
.B grep
was not compiled with national language support (\s-1NLS\s0).
uses for messages.
The default C locale uses American English messages.
.TP
.B POSIXLY_CORRECT
If set,
@ -618,13 +753,14 @@ when
is not set.
.SH DIAGNOSTICS
.PP
Normally, exit status is 0 if matches were found,
and 1 if no matches were found. (The
.B \-v
option inverts the sense of the exit status.)
Exit status is 2 if there were syntax errors
in the pattern, inaccessible input files, or
other system errors.
Normally, exit status is 0 if selected lines are found and 1 otherwise.
But the exit status is 2 if an error occurred, unless the
.B \-q
or
.B \-\^\-quiet
or
.B \-\^\-silent
option is used and a selected line is found.
.SH BUGS
.PP
Email bug reports to
@ -633,7 +769,7 @@ Be sure to include the word \*(lqgrep\*(rq somewhere in the
\*(lqSubject:\*(rq field.
.PP
Large repetition counts in the
.BI { m , n }
.BI { n , m }
construct may cause grep to use lots of memory.
In addition,
certain other obscure regular expressions require exponential time
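
Note: the ENVIRONMENT VARIABLES text added above gives a lookup order of LC_ALL, then the category variable, then LANG, with the C locale as the fallback. A minimal sketch of that order follows. `locale_name_for` is a hypothetical helper, not part of grep, and treating an empty value as unset is an assumption of the sketch (the man page only says "set").

```c
#include <assert.h>
#include <stdlib.h>

/* Return the locale name that applies to the category whose
   environment variable is CATEGORY_VAR (for example "LC_MESSAGES"):
   the first of LC_ALL, CATEGORY_VAR and LANG that has a value, or
   "C" if none does.  */
static char const *
locale_name_for (char const *category_var)
{
  char const *names[3];
  size_t i;

  assert (category_var != NULL);
  names[0] = "LC_ALL";
  names[1] = category_var;
  names[2] = "LANG";
  for (i = 0; i < 3; i++)
    {
      char const *value = getenv (names[i]);
      if (value && *value)	/* Assumption: empty counts as unset.  */
	return value;
    }
  return "C";
}
```

With LC_ALL unset and LC_MESSAGES set to pt_BR, `locale_name_for ("LC_MESSAGES")` yields pt_BR, as in the man page example; setting LC_ALL overrides both other variables.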

File diff suppressed because it is too large


@ -1,5 +1,5 @@
/* grep.h - interface to grep driver for searching subroutines.
Copyright (C) 1992, 1998 Free Software Foundation, Inc.
Copyright (C) 1992, 1998, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@ -20,20 +20,16 @@
# define __attribute__(x)
#endif
extern void fatal PARAMS ((const char *, int)) __attribute__((noreturn));
extern char *xmalloc PARAMS ((size_t size));
extern char *xrealloc PARAMS ((char *ptr, size_t size));
/* Grep.c expects the matchers vector to be terminated
by an entry with a NULL name, and to contain at least
by an entry with a NULL compile, and to contain at least
an entry named "default". */
extern struct matcher
{
char *name;
void (*compile) PARAMS ((char *, size_t));
char *(*execute) PARAMS ((char *, size_t, char **));
} matchers[];
char name[8];
void (*compile) PARAMS ((char const *, size_t));
size_t (*execute) PARAMS ((char const *, size_t, size_t *, int));
} const matchers[];
/* Exported from fgrepmat.c, egrepmat.c, grepmat.c. */
extern char const *matcher;


@ -0,0 +1,85 @@
/* hard-locale.c -- Determine whether a locale is hard.
Copyright 1997, 1998, 1999 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#ifndef __GNUC__
# ifdef HAVE_ALLOCA_H
# include <alloca.h>
# else
# ifdef _AIX
# pragma alloca
# else
# ifdef _WIN32
# include <malloc.h>
# include <io.h>
# else
# ifndef alloca
char *alloca ();
# endif
# endif
# endif
# endif
#endif
#if HAVE_LOCALE_H
# include <locale.h>
#endif
#if HAVE_STRING_H
# include <string.h>
#endif
/* Return nonzero if the current CATEGORY locale is hard, i.e. if you
can't get away with assuming traditional C or POSIX behavior. */
int
hard_locale (int category)
{
#if ! (defined ENABLE_NLS && HAVE_SETLOCALE)
return 0;
#else
int hard = 1;
char const *p = setlocale (category, 0);
if (p)
{
# if defined __GLIBC__ && __GLIBC__ >= 2
if (strcmp (p, "C") == 0 || strcmp (p, "POSIX") == 0)
hard = 0;
# else
char *locale = alloca (strlen (p) + 1);
strcpy (locale, p);
/* Temporarily set the locale to the "C" and "POSIX" locales to
find their names, so that we can determine whether one or the
other is the caller's locale. */
if (((p = setlocale (category, "C")) && strcmp (p, locale) == 0)
|| ((p = setlocale (category, "POSIX")) && strcmp (p, locale) == 0))
hard = 0;
/* Restore the caller's locale. */
setlocale (category, locale);
# endif
}
return hard;
#endif
}
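
Note: on glibc, hard_locale reduces to a name comparison against "C" and "POSIX". That branch can be sketched on its own as below. `locale_is_c_or_posix` is a hypothetical name, and the sketch assumes, as the glibc branch above does, that the C library reports these locales under exactly those names; the fallback branch in hard-locale.c exists because other libraries may not.

```c
#include <locale.h>
#include <string.h>

/* Return nonzero if the current CATEGORY locale is reported as "C"
   or "POSIX".  This is the inverse of hard_locale's result.  */
static int
locale_is_c_or_posix (int category)
{
  /* A null second argument queries the locale without changing it.  */
  char const *name = setlocale (category, NULL);

  if (!name)
    return 0;
  return strcmp (name, "C") == 0 || strcmp (name, "POSIX") == 0;
}
```

A caller would typically test LC_COLLATE or LC_CTYPE this way before choosing a fast path that assumes byte-order comparisons.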


@ -0,0 +1,18 @@
#ifndef HARD_LOCALE_H_
# define HARD_LOCALE_H_ 1
# if HAVE_CONFIG_H
# include <config.h>
# endif
# ifndef PARAMS
# if defined PROTOTYPES || (defined __STDC__ && __STDC__)
# define PARAMS(Args) Args
# else
# define PARAMS(Args) ()
# endif
# endif
int hard_locale PARAMS ((int));
#endif /* HARD_LOCALE_H_ */

42
gnu/usr.bin/grep/isdir.c Normal file

@ -0,0 +1,42 @@
/* isdir.c -- determine whether a directory exists
Copyright (C) 1990, 1998 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#include <sys/types.h>
#include <sys/stat.h>
#if STAT_MACROS_BROKEN
# undef S_ISDIR
#endif
#if !defined S_ISDIR && defined S_IFDIR
# define S_ISDIR(Mode) (((Mode) & S_IFMT) == S_IFDIR)
#endif
/* If PATH is an existing directory or symbolic link to a directory,
return nonzero, else 0. */
int
isdir (const char *path)
{
struct stat stats;
return stat (path, &stats) == 0 && S_ISDIR (stats.st_mode);
}


@ -81,22 +81,13 @@ struct kwset
struct trie *next[NCHAR]; /* Table of children of the root. */
char *target; /* Target string if there's only one. */
int mind2; /* Used in Boyer-Moore search for one string. */
char *trans; /* Character translation table. */
char const *trans; /* Character translation table. */
};
/* prototypes */
static void enqueue PARAMS((struct tree *, struct trie **));
static void treefails PARAMS((register struct tree *, struct trie *, struct trie *));
static void treedelta PARAMS((register struct tree *,register unsigned int, unsigned char *));
static int hasevery PARAMS((register struct tree *, register struct tree *));
static void treenext PARAMS((struct tree *, struct trie **));
static char * bmexec PARAMS((kwset_t, char *, size_t));
static char * cwexec PARAMS((kwset_t, char *, size_t, struct kwsmatch *));
/* Allocate and initialize a keyword set object, returning an opaque
pointer to it. Return NULL if memory is not available. */
kwset_t
kwsalloc (char *trans)
kwsalloc (char const *trans)
{
struct kwset *kwset;
@ -131,7 +122,7 @@ kwsalloc (char *trans)
/* Add the given string to the contents of the keyword set. Return NULL
for success, an error message otherwise. */
char *
kwsincr (kwset_t kws, char *text, size_t len)
kwsincr (kwset_t kws, char const *text, size_t len)
{
struct kwset *kwset;
register struct trie *trie;
@ -301,7 +292,8 @@ enqueue (struct tree *tree, struct trie **last)
from the given tree, given the failure function for their parent as
well as a last resort failure node. */
static void
treefails (register struct tree *tree, struct trie *fail, struct trie *recourse)
treefails (register struct tree const *tree, struct trie const *fail,
struct trie *recourse)
{
register struct tree *link;
@ -335,7 +327,7 @@ treefails (register struct tree *tree, struct trie *fail, struct trie *recourse)
/* Set delta entries for the links of the given tree such that
the preexisting delta value is larger than the current depth. */
static void
treedelta (register struct tree *tree,
treedelta (register struct tree const *tree,
register unsigned int depth,
unsigned char delta[])
{
@ -349,7 +341,7 @@ treedelta (register struct tree *tree,
/* Return true if A has every label in B. */
static int
hasevery (register struct tree *a, register struct tree *b)
hasevery (register struct tree const *a, register struct tree const *b)
{
if (!b)
return 1;
@ -368,7 +360,7 @@ hasevery (register struct tree *a, register struct tree *b)
/* Compute a vector, indexed by character code, of the trie nodes
referenced from the given tree. */
static void
treenext (struct tree *tree, struct trie *next[])
treenext (struct tree const *tree, struct trie *next[])
{
if (!tree)
return;
@ -385,7 +377,7 @@ kwsprep (kwset_t kws)
register struct kwset *kwset;
register int i;
register struct trie *curr, *fail;
register char *trans;
register char const *trans;
unsigned char delta[NCHAR];
struct trie *last, *next[NCHAR];
@ -497,23 +489,26 @@ kwsprep (kwset_t kws)
#define U(C) ((unsigned char) (C))
/* Fast boyer-moore search. */
static char *
bmexec (kwset_t kws, char *text, size_t size)
static size_t
bmexec (kwset_t kws, char const *text, size_t size)
{
struct kwset *kwset;
register unsigned char *d1;
register char *ep, *sp, *tp;
struct kwset const *kwset;
register unsigned char const *d1;
register char const *ep, *sp, *tp;
register int d, gc, i, len, md2;
kwset = (struct kwset *) kws;
kwset = (struct kwset const *) kws;
len = kwset->mind;
if (len == 0)
return text;
if (len > size)
return 0;
if (len > size)
return -1;
if (len == 1)
return memchr(text, kwset->target[0], size);
{
tp = memchr (text, kwset->target[0], size);
return tp ? tp - text : -1;
}
d1 = kwset->delta;
sp = kwset->target + len;
@ -552,7 +547,7 @@ bmexec (kwset_t kws, char *text, size_t size)
for (i = 3; i <= len && U(tp[-i]) == U(sp[-i]); ++i)
;
if (i > len)
return tp - len;
return tp - len - text;
}
tp += md2;
}
@ -571,26 +566,29 @@ bmexec (kwset_t kws, char *text, size_t size)
for (i = 3; i <= len && U(tp[-i]) == U(sp[-i]); ++i)
;
if (i > len)
return tp - len;
return tp - len - text;
}
d = md2;
}
return 0;
return -1;
}
/* Hairy multiple string search. */
static char *
cwexec (kwset_t kws, char *text, size_t len, struct kwsmatch *kwsmatch)
static size_t
cwexec (kwset_t kws, char const *text, size_t len, struct kwsmatch *kwsmatch)
{
struct kwset *kwset;
struct trie **next, *trie, *accept;
char *beg, *lim, *mch, *lmch;
register unsigned char c, *delta;
struct kwset const *kwset;
struct trie * const *next;
struct trie const *trie;
struct trie const *accept;
char const *beg, *lim, *mch, *lmch;
register unsigned char c;
register unsigned char const *delta;
register int d;
register char *end, *qlim;
register struct tree *tree;
register char *trans;
register char const *end, *qlim;
register struct tree const *tree;
register char const *trans;
#ifdef lint
accept = NULL;
@ -599,7 +597,7 @@ cwexec (kwset_t kws, char *text, size_t len, struct kwsmatch *kwsmatch)
/* Initialize register copies and look for easy ways out. */
kwset = (struct kwset *) kws;
if (len < kwset->mind)
return 0;
return -1;
next = kwset->next;
delta = kwset->delta;
trans = kwset->trans;
@ -668,7 +666,7 @@ cwexec (kwset_t kws, char *text, size_t len, struct kwsmatch *kwsmatch)
if (mch)
goto match;
}
return 0;
return -1;
match:
/* Given a known match, find the longest possible match anchored
@ -728,10 +726,10 @@ cwexec (kwset_t kws, char *text, size_t len, struct kwsmatch *kwsmatch)
if (kwsmatch)
{
kwsmatch->index = accept->accepting / 2;
kwsmatch->beg[0] = mch;
kwsmatch->offset[0] = mch - text;
kwsmatch->size[0] = accept->depth;
}
return mch;
return mch - text;
}
/* Search through the given text for a match of any member of the
@ -741,20 +739,18 @@ cwexec (kwset_t kws, char *text, size_t len, struct kwsmatch *kwsmatch)
matching substring. Similarly, if FOUNDIDX is non-NULL, store
in the referenced location the index number of the particular
keyword matched. */
char *
kwsexec (kwset_t kws, char *text, size_t size, struct kwsmatch *kwsmatch)
size_t
kwsexec (kwset_t kws, char const *text, size_t size,
struct kwsmatch *kwsmatch)
{
struct kwset *kwset;
char *ret;
kwset = (struct kwset *) kws;
struct kwset const *kwset = (struct kwset *) kws;
if (kwset->words == 1 && kwset->trans == 0)
{
ret = bmexec(kws, text, size);
if (kwsmatch != 0 && ret != 0)
size_t ret = bmexec (kws, text, size);
if (kwsmatch != 0 && ret != (size_t) -1)
{
kwsmatch->index = 0;
kwsmatch->beg[0] = ret;
kwsmatch->offset[0] = ret;
kwsmatch->size[0] = kwset->mind;
}
return ret;


@ -23,7 +23,7 @@
struct kwsmatch
{
int index; /* Index number of matching keyword. */
char *beg[1]; /* Begin pointer for each submatch. */
size_t offset[1]; /* Offset of each submatch. */
size_t size[1]; /* Length of each submatch. */
};
@ -33,12 +33,12 @@ typedef ptr_t kwset_t;
if enough memory cannot be obtained. The argument if non-NULL
specifies a table of character translations to be applied to all
pattern and search text. */
extern kwset_t kwsalloc PARAMS((char *));
extern kwset_t kwsalloc PARAMS((char const *));
/* Incrementally extend the keyword set to include the given string.
Return NULL for success, or an error message. Remember an index
number for each keyword included in the set. */
extern char *kwsincr PARAMS((kwset_t, char *, size_t));
extern char *kwsincr PARAMS((kwset_t, char const *, size_t));
/* When the keyword set has been completely built, prepare it for
use. Return NULL for success, or an error message. */
@ -50,7 +50,7 @@ extern char *kwsprep PARAMS((kwset_t));
the matching substring in the integer it points to. Similarly,
if foundindex is non-NULL, store the index of the particular
keyword found therein. */
extern char *kwsexec PARAMS((kwset_t, char *, size_t, struct kwsmatch *));
extern size_t kwsexec PARAMS((kwset_t, char const *, size_t, struct kwsmatch *));
/* Deallocate the given keyword set and all its associated storage. */
extern void kwsfree PARAMS((kwset_t));
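
Note: the kwset hunks above replace pointer results with offsets into the text, so "no match" changes from a null pointer to `(size_t) -1`; an offset of 0 is now a valid match at the start of the buffer. A minimal sketch of the convention, modelled on the single-byte case in bmexec (`find_byte_offset` and `NO_MATCH` are hypothetical names):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Failure value under the offset convention.  */
#define NO_MATCH ((size_t) -1)

/* Return the offset of the first TARGET byte within the SIZE bytes
   at TEXT, or NO_MATCH if there is none.  */
static size_t
find_byte_offset (char const *text, size_t size, char target)
{
  char const *tp;

  assert (text != NULL);
  tp = memchr (text, target, size);
  return tp ? (size_t) (tp - text) : NO_MATCH;
}
```

Callers compare the result against `(size_t) -1` rather than testing it for zero, as kwsexec does after calling bmexec.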

613
gnu/usr.bin/grep/quotearg.c Normal file

@ -0,0 +1,613 @@
/* quotearg.c - quote arguments for output
Copyright (C) 1998, 1999, 2000, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
/* Written by Paul Eggert <eggert@twinsun.com> */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#if HAVE_STDDEF_H
# include <stddef.h> /* For the definition of size_t on windows w/MSVC. */
#endif
#include <sys/types.h>
#include <quotearg.h>
#include <xalloc.h>
#include <ctype.h>
#if ENABLE_NLS
# include <libintl.h>
# define _(text) gettext (text)
#else
# define _(text) text
#endif
#define N_(text) text
#if HAVE_LIMITS_H
# include <limits.h>
#endif
#ifndef CHAR_BIT
# define CHAR_BIT 8
#endif
#ifndef UCHAR_MAX
# define UCHAR_MAX ((unsigned char) -1)
#endif
#if HAVE_C_BACKSLASH_A
# define ALERT_CHAR '\a'
#else
# define ALERT_CHAR '\7'
#endif
#if HAVE_STDLIB_H
# include <stdlib.h>
#endif
#if HAVE_STRING_H
# include <string.h>
#endif
#if HAVE_WCHAR_H
# include <wchar.h>
#endif
#if !HAVE_MBRTOWC
/* Disable multibyte processing entirely. Since MB_CUR_MAX is 1, the
other macros are defined only for documentation and to satisfy C
syntax. */
# undef MB_CUR_MAX
# define MB_CUR_MAX 1
# define mbrtowc(pwc, s, n, ps) ((*(pwc) = *(s)) != 0)
# define mbsinit(ps) 1
# define iswprint(wc) ISPRINT ((unsigned char) (wc))
#endif
#ifndef iswprint
# if HAVE_WCTYPE_H
# include <wctype.h>
# endif
# if !defined iswprint && !HAVE_ISWPRINT
# define iswprint(wc) 1
# endif
#endif
#define INT_BITS (sizeof (int) * CHAR_BIT)
#if defined (STDC_HEADERS) || (!defined (isascii) && !defined (HAVE_ISASCII))
# define IN_CTYPE_DOMAIN(c) 1
#else
# define IN_CTYPE_DOMAIN(c) isascii(c)
#endif
/* Undefine to protect against the definition in wctype.h of solaris2.6. */
#undef ISPRINT
#define ISPRINT(c) (IN_CTYPE_DOMAIN (c) && isprint (c))
struct quoting_options
{
/* Basic quoting style. */
enum quoting_style style;
/* Quote the characters indicated by this bit vector even if the
quoting style would not normally require them to be quoted. */
int quote_these_too[(UCHAR_MAX / INT_BITS) + 1];
};
/* Names of quoting styles. */
char const *const quoting_style_args[] =
{
"literal",
"shell",
"shell-always",
"c",
"escape",
"locale",
"clocale",
0
};
/* Correspondences to quoting style names. */
enum quoting_style const quoting_style_vals[] =
{
literal_quoting_style,
shell_quoting_style,
shell_always_quoting_style,
c_quoting_style,
escape_quoting_style,
locale_quoting_style,
clocale_quoting_style
};
/* The default quoting options. */
static struct quoting_options default_quoting_options;
/* Allocate a new set of quoting options, with contents initially identical
to O if O is not null, or to the default if O is null.
It is the caller's responsibility to free the result. */
struct quoting_options *
clone_quoting_options (struct quoting_options *o)
{
struct quoting_options *p
= (struct quoting_options *) xmalloc (sizeof (struct quoting_options));
*p = *(o ? o : &default_quoting_options);
return p;
}
/* Get the value of O's quoting style. If O is null, use the default. */
enum quoting_style
get_quoting_style (struct quoting_options *o)
{
return (o ? o : &default_quoting_options)->style;
}
/* In O (or in the default if O is null),
set the value of the quoting style to S. */
void
set_quoting_style (struct quoting_options *o, enum quoting_style s)
{
(o ? o : &default_quoting_options)->style = s;
}
/* In O (or in the default if O is null),
set the value of the quoting options for character C to I.
Return the old value. Currently, the only values defined for I are
0 (the default) and 1 (which means to quote the character even if
it would not otherwise be quoted). */
int
set_char_quoting (struct quoting_options *o, char c, int i)
{
unsigned char uc = c;
int *p = (o ? o : &default_quoting_options)->quote_these_too + uc / INT_BITS;
int shift = uc % INT_BITS;
int r = (*p >> shift) & 1;
*p ^= ((i & 1) ^ r) << shift;
return r;
}
/* MSGID approximates a quotation mark. Return its translation if it
has one; otherwise, return either it or "\"", depending on S. */
static char const *
gettext_quote (char const *msgid, enum quoting_style s)
{
char const *translation = _(msgid);
if (translation == msgid && s == clocale_quoting_style)
translation = "\"";
return translation;
}
/* Place into buffer BUFFER (of size BUFFERSIZE) a quoted version of
argument ARG (of size ARGSIZE), using QUOTING_STYLE and the
non-quoting-style part of O to control quoting.
Terminate the output with a null character, and return the written
size of the output, not counting the terminating null.
If BUFFERSIZE is too small to store the output string, return the
value that would have been returned had BUFFERSIZE been large enough.
If ARGSIZE is -1, use the string length of the argument for ARGSIZE.
This function acts like quotearg_buffer (BUFFER, BUFFERSIZE, ARG,
ARGSIZE, O), except it uses QUOTING_STYLE instead of the quoting
style specified by O, and O may not be null. */
static size_t
quotearg_buffer_restyled (char *buffer, size_t buffersize,
char const *arg, size_t argsize,
enum quoting_style quoting_style,
struct quoting_options const *o)
{
size_t i;
size_t len = 0;
char const *quote_string = 0;
size_t quote_string_len = 0;
int backslash_escapes = 0;
int unibyte_locale = MB_CUR_MAX == 1;
#define STORE(c) \
do \
{ \
if (len < buffersize) \
buffer[len] = (c); \
len++; \
} \
while (0)
switch (quoting_style)
{
case c_quoting_style:
STORE ('"');
backslash_escapes = 1;
quote_string = "\"";
quote_string_len = 1;
break;
case escape_quoting_style:
backslash_escapes = 1;
break;
case locale_quoting_style:
case clocale_quoting_style:
{
/* Get translations for open and closing quotation marks.
The message catalog should translate "`" to a left
quotation mark suitable for the locale, and similarly for
"'". If the catalog has no translation,
locale_quoting_style quotes `like this', and
clocale_quoting_style quotes "like this".
For example, an American English Unicode locale should
translate "`" to U+201C (LEFT DOUBLE QUOTATION MARK), and
should translate "'" to U+201D (RIGHT DOUBLE QUOTATION
MARK). A British English Unicode locale should instead
translate these to U+2018 (LEFT SINGLE QUOTATION MARK) and
U+2019 (RIGHT SINGLE QUOTATION MARK), respectively. */
char const *left = gettext_quote (N_("`"), quoting_style);
char const *right = gettext_quote (N_("'"), quoting_style);
for (quote_string = left; *quote_string; quote_string++)
STORE (*quote_string);
backslash_escapes = 1;
quote_string = right;
quote_string_len = strlen (quote_string);
}
break;
case shell_always_quoting_style:
STORE ('\'');
quote_string = "'";
quote_string_len = 1;
break;
default:
break;
}
for (i = 0; ! (argsize == (size_t) -1 ? arg[i] == '\0' : i == argsize); i++)
{
unsigned char c;
unsigned char esc;
if (backslash_escapes
&& quote_string_len
&& i + quote_string_len <= argsize
&& memcmp (arg + i, quote_string, quote_string_len) == 0)
STORE ('\\');
c = arg[i];
switch (c)
{
case '?':
switch (quoting_style)
{
case shell_quoting_style:
goto use_shell_always_quoting_style;
case c_quoting_style:
if (i + 2 < argsize && arg[i + 1] == '?')
switch (arg[i + 2])
{
case '!': case '\'':
case '(': case ')': case '-': case '/':
case '<': case '=': case '>':
/* Escape the second '?' in what would otherwise be
a trigraph. */
c = arg[i + 2];
i += 2;
STORE ('?');
STORE ('\\');
STORE ('?');
break;
}
break;
default:
break;
}
break;
case ALERT_CHAR: esc = 'a'; goto c_escape;
case '\b': esc = 'b'; goto c_escape;
case '\f': esc = 'f'; goto c_escape;
case '\n': esc = 'n'; goto c_and_shell_escape;
case '\r': esc = 'r'; goto c_and_shell_escape;
case '\t': esc = 't'; goto c_and_shell_escape;
case '\v': esc = 'v'; goto c_escape;
case '\\': esc = c; goto c_and_shell_escape;
c_and_shell_escape:
if (quoting_style == shell_quoting_style)
goto use_shell_always_quoting_style;
c_escape:
if (backslash_escapes)
{
c = esc;
goto store_escape;
}
break;
case '#': case '~':
if (i != 0)
break;
/* Fall through. */
case ' ':
case '!': /* special in bash */
case '"': case '$': case '&':
case '(': case ')': case '*': case ';':
case '<': case '>': case '[':
case '^': /* special in old /bin/sh, e.g. SunOS 4.1.4 */
case '`': case '|':
/* A shell special character. In theory, '$' and '`' could
be the first bytes of multibyte characters, which means
we should check them with mbrtowc, but in practice this
doesn't happen so it's not worth worrying about. */
if (quoting_style == shell_quoting_style)
goto use_shell_always_quoting_style;
break;
case '\'':
switch (quoting_style)
{
case shell_quoting_style:
goto use_shell_always_quoting_style;
case shell_always_quoting_style:
STORE ('\'');
STORE ('\\');
STORE ('\'');
break;
default:
break;
}
break;
case '%': case '+': case ',': case '-': case '.': case '/':
case '0': case '1': case '2': case '3': case '4': case '5':
case '6': case '7': case '8': case '9': case ':': case '=':
case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
case 'G': case 'H': case 'I': case 'J': case 'K': case 'L':
case 'M': case 'N': case 'O': case 'P': case 'Q': case 'R':
case 'S': case 'T': case 'U': case 'V': case 'W': case 'X':
case 'Y': case 'Z': case ']': case '_': case 'a': case 'b':
case 'c': case 'd': case 'e': case 'f': case 'g': case 'h':
case 'i': case 'j': case 'k': case 'l': case 'm': case 'n':
case 'o': case 'p': case 'q': case 'r': case 's': case 't':
case 'u': case 'v': case 'w': case 'x': case 'y': case 'z':
case '{': case '}':
/* These characters don't cause problems, no matter what the
quoting style is. They cannot start multibyte sequences. */
break;
default:
/* If we have a multibyte sequence, copy it until we reach
its end, find an error, or come back to the initial shift
state. For C-like styles, if the sequence has
unprintable characters, escape the whole sequence, since
we can't easily escape single characters within it. */
{
/* Length of multibyte sequence found so far. */
size_t m;
int printable;
if (unibyte_locale)
{
m = 1;
printable = ISPRINT (c);
}
else
{
mbstate_t mbstate;
memset (&mbstate, 0, sizeof mbstate);
m = 0;
printable = 1;
if (argsize == (size_t) -1)
argsize = strlen (arg);
do
{
wchar_t w;
size_t bytes = mbrtowc (&w, &arg[i + m],
argsize - (i + m), &mbstate);
if (bytes == 0)
break;
else if (bytes == (size_t) -1)
{
printable = 0;
break;
}
else if (bytes == (size_t) -2)
{
printable = 0;
while (i + m < argsize && arg[i + m])
m++;
break;
}
else
{
if (! iswprint (w))
printable = 0;
m += bytes;
}
}
while (! mbsinit (&mbstate));
}
if (1 < m || (backslash_escapes && ! printable))
{
/* Output a multibyte sequence, or an escaped
unprintable unibyte character. */
size_t ilim = i + m;
for (;;)
{
if (backslash_escapes && ! printable)
{
STORE ('\\');
STORE ('0' + (c >> 6));
STORE ('0' + ((c >> 3) & 7));
c = '0' + (c & 7);
}
if (ilim <= i + 1)
break;
STORE (c);
c = arg[++i];
}
goto store_c;
}
}
}
if (! (backslash_escapes
&& o->quote_these_too[c / INT_BITS] & (1 << (c % INT_BITS))))
goto store_c;
store_escape:
STORE ('\\');
store_c:
STORE (c);
}
if (quote_string)
for (; *quote_string; quote_string++)
STORE (*quote_string);
if (len < buffersize)
buffer[len] = '\0';
return len;
use_shell_always_quoting_style:
return quotearg_buffer_restyled (buffer, buffersize, arg, argsize,
shell_always_quoting_style, o);
}
/* Place into buffer BUFFER (of size BUFFERSIZE) a quoted version of
argument ARG (of size ARGSIZE), using O to control quoting.
If O is null, use the default.
Terminate the output with a null character, and return the written
size of the output, not counting the terminating null.
If BUFFERSIZE is too small to store the output string, return the
value that would have been returned had BUFFERSIZE been large enough.
If ARGSIZE is -1, use the string length of the argument for ARGSIZE. */
size_t
quotearg_buffer (char *buffer, size_t buffersize,
char const *arg, size_t argsize,
struct quoting_options const *o)
{
struct quoting_options const *p = o ? o : &default_quoting_options;
return quotearg_buffer_restyled (buffer, buffersize, arg, argsize,
p->style, p);
}
/* Use storage slot N to return a quoted version of the string ARG.
OPTIONS specifies the quoting options.
The returned value points to static storage that can be
reused by the next call to this function with the same value of N.
N must be nonnegative. N is deliberately declared with type "int"
to allow for future extensions (using negative values). */
static char *
quotearg_n_options (int n, char const *arg,
struct quoting_options const *options)
{
/* Preallocate a slot 0 buffer, so that the caller can always quote
one small component of a "memory exhausted" message in slot 0. */
static char slot0[256];
static unsigned int nslots = 1;
struct slotvec
{
size_t size;
char *val;
};
static struct slotvec slotvec0 = {sizeof slot0, slot0};
static struct slotvec *slotvec = &slotvec0;
if (nslots <= n)
{
int n1 = n + 1;
size_t s = n1 * sizeof (struct slotvec);
if (! (0 < n1 && n1 == s / sizeof (struct slotvec)))
abort ();
if (slotvec == &slotvec0)
{
slotvec = (struct slotvec *) xmalloc (sizeof (struct slotvec));
*slotvec = slotvec0;
}
slotvec = (struct slotvec *) xrealloc (slotvec, s);
memset (slotvec + nslots, 0, (n1 - nslots) * sizeof (struct slotvec));
nslots = n1;
}
{
size_t size = slotvec[n].size;
char *val = slotvec[n].val;
size_t qsize = quotearg_buffer (val, size, arg, (size_t) -1, options);
if (size <= qsize)
{
slotvec[n].size = size = qsize + 1;
slotvec[n].val = val = xrealloc (val == slot0 ? 0 : val, size);
quotearg_buffer (val, size, arg, (size_t) -1, options);
}
return val;
}
}
char *
quotearg_n (unsigned int n, char const *arg)
{
return quotearg_n_options (n, arg, &default_quoting_options);
}
char *
quotearg (char const *arg)
{
return quotearg_n (0, arg);
}
char *
quotearg_n_style (unsigned int n, enum quoting_style s, char const *arg)
{
struct quoting_options o;
o.style = s;
memset (o.quote_these_too, 0, sizeof o.quote_these_too);
return quotearg_n_options (n, arg, &o);
}
char *
quotearg_style (enum quoting_style s, char const *arg)
{
return quotearg_n_style (0, s, arg);
}
char *
quotearg_char (char const *arg, char ch)
{
struct quoting_options options;
options = default_quoting_options;
set_char_quoting (&options, ch, 1);
return quotearg_n_options (0, arg, &options);
}
char *
quotearg_colon (char const *arg)
{
return quotearg_char (arg, ':');
}
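
The size contract documented for quotearg_buffer (return the length the output would need, even when the buffer is too small) is what lets quotearg_n_options call it twice: once to measure, once to fill. A minimal sketch of that measure-then-fill idiom follows. The names sketch_quote and sketch_quote_alloc are hypothetical stand-ins, not functions from quotearg.c, and the quoting rule is deliberately reduced to wrapping in double quotes with backslash escapes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for quotearg_buffer (not part of quotearg.c).
   Wraps ARG in double quotes and puts a backslash before '"' and '\\'.
   Returns the length the complete output needs, not counting the
   terminating null, even when BUFFERSIZE is too small to hold it.  */
static size_t
sketch_quote (char *buffer, size_t buffersize, char const *arg)
{
  size_t len = 0;

  assert (arg != NULL);

/* Same idea as STORE in quotearg_buffer_restyled: always count the
   byte, but write it only while there is room.  */
#define SKETCH_STORE(c) \
  do \
    { \
      if (len < buffersize) \
        buffer[len] = (c); \
      len++; \
    } \
  while (0)

  SKETCH_STORE ('"');
  for (; *arg; arg++)
    {
      if (*arg == '"' || *arg == '\\')
        SKETCH_STORE ('\\');
      SKETCH_STORE (*arg);
    }
  SKETCH_STORE ('"');
#undef SKETCH_STORE

  if (len < buffersize)
    buffer[len] = '\0';
  return len;
}

/* Measure first, then allocate exactly LEN + 1 bytes and fill them.
   Returns NULL if malloc fails; the caller frees the result.  */
static char *
sketch_quote_alloc (char const *arg)
{
  size_t need = sketch_quote (NULL, 0, arg) + 1;
  char *buf = malloc (need);
  if (buf)
    sketch_quote (buf, need, arg);
  return buf;
}
```

As with quotearg_buffer, a truncated result is not null-terminated, so a caller that passes a short buffer should compare the return value against the buffer size before using the contents.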

110
gnu/usr.bin/grep/quotearg.h Normal file

@ -0,0 +1,110 @@
/* quotearg.h - quote arguments for output
Copyright (C) 1998, 1999, 2000 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
/* Written by Paul Eggert <eggert@twinsun.com> */
/* Basic quoting styles. */
enum quoting_style
{
literal_quoting_style, /* --quoting-style=literal */
shell_quoting_style, /* --quoting-style=shell */
shell_always_quoting_style, /* --quoting-style=shell-always */
c_quoting_style, /* --quoting-style=c */
escape_quoting_style, /* --quoting-style=escape */
locale_quoting_style, /* --quoting-style=locale */
clocale_quoting_style /* --quoting-style=clocale */
};
/* For now, --quoting-style=literal is the default, but this may change. */
#ifndef DEFAULT_QUOTING_STYLE
# define DEFAULT_QUOTING_STYLE literal_quoting_style
#endif
/* Names of quoting styles and their corresponding values. */
extern char const *const quoting_style_args[];
extern enum quoting_style const quoting_style_vals[];
struct quoting_options;
#ifndef PARAMS
# if defined PROTOTYPES || defined __STDC__
# define PARAMS(Args) Args
# else
# define PARAMS(Args) ()
# endif
#endif
/* The functions listed below set and use a hidden variable
that contains the default quoting style options. */
/* Allocate a new set of quoting options, with contents initially identical
to O if O is not null, or to the default if O is null.
It is the caller's responsibility to free the result. */
struct quoting_options *clone_quoting_options
PARAMS ((struct quoting_options *o));
/* Get the value of O's quoting style. If O is null, use the default. */
enum quoting_style get_quoting_style PARAMS ((struct quoting_options *o));
/* In O (or in the default if O is null),
set the value of the quoting style to S. */
void set_quoting_style PARAMS ((struct quoting_options *o,
enum quoting_style s));
/* In O (or in the default if O is null),
set the value of the quoting options for character C to I.
Return the old value. Currently, the only values defined for I are
0 (the default) and 1 (which means to quote the character even if
it would not otherwise be quoted). */
int set_char_quoting PARAMS ((struct quoting_options *o, char c, int i));
/* Place into buffer BUFFER (of size BUFFERSIZE) a quoted version of
argument ARG (of size ARGSIZE), using O to control quoting.
If O is null, use the default.
Terminate the output with a null character, and return the written
size of the output, not counting the terminating null.
If BUFFERSIZE is too small to store the output string, return the
value that would have been returned had BUFFERSIZE been large enough.
If ARGSIZE is -1, use the string length of the argument for ARGSIZE. */
size_t quotearg_buffer PARAMS ((char *buffer, size_t buffersize,
char const *arg, size_t argsize,
struct quoting_options const *o));
/* Use storage slot N to return a quoted version of the string ARG.
Use the default quoting options.
The returned value points to static storage that can be
reused by the next call to this function with the same value of N.
N must be nonnegative. */
char *quotearg_n PARAMS ((unsigned int n, char const *arg));
/* Equivalent to quotearg_n (0, ARG). */
char *quotearg PARAMS ((char const *arg));
/* Use style S and storage slot N to return a quoted version of the string ARG.
This is like quotearg_n (N, ARG), except that it uses S with no other
options to specify the quoting method. */
char *quotearg_n_style PARAMS ((unsigned int n, enum quoting_style s,
char const *arg));
/* Equivalent to quotearg_n_style (0, S, ARG). */
char *quotearg_style PARAMS ((enum quoting_style s, char const *arg));
/* Like quotearg (ARG), except also quote any instances of CH. */
char *quotearg_char PARAMS ((char const *arg, char ch));
/* Equivalent to quotearg_char (ARG, ':'). */
char *quotearg_colon PARAMS ((char const *arg));
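
The set_char_quoting contract above (set one flag per character value, return the old flag) can be sketched as a small bit set indexed by unsigned char. This is an illustration only: struct sketch_charset, sketch_set and sketch_test are hypothetical names, and the sketch stores the bits in unsigned int rather than int so that shifting into the top bit stays well defined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of bits in one word of the set (assumption: no padding bits).  */
#define SKETCH_INT_BITS (sizeof (unsigned int) * CHAR_BIT)

/* One bit per possible unsigned char value.  Hypothetical type, modeled
   on the quote_these_too member of struct quoting_options.  */
struct sketch_charset
{
  unsigned int bits[(UCHAR_MAX / SKETCH_INT_BITS) + 1];
};

/* Set the flag for character C to I (0 or 1) and return the old flag.  */
static int
sketch_set (struct sketch_charset *s, char c, int i)
{
  unsigned char uc = c;
  unsigned int *p;
  unsigned int shift = uc % SKETCH_INT_BITS;
  int old;

  assert (s != NULL);
  p = s->bits + uc / SKETCH_INT_BITS;
  old = (*p >> shift) & 1;
  /* XOR flips the bit only when the requested value differs from OLD.  */
  *p ^= (unsigned int) ((i & 1) ^ old) << shift;
  return old;
}

/* Return 1 if the flag for character C is set, otherwise 0.  */
static int
sketch_test (struct sketch_charset const *s, char c)
{
  unsigned char uc = c;

  assert (s != NULL);
  return (s->bits[uc / SKETCH_INT_BITS] >> (uc % SKETCH_INT_BITS)) & 1;
}
```

Converting C to unsigned char before indexing matters on platforms where plain char is signed; without it a byte above 127 would produce a negative index.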


@ -1,5 +1,5 @@
/* savedir.c -- save the list of files in a directory in a string
Copyright (C) 1990, 1997, 1998, 1999, 2000 Free Software Foundation, Inc.
Copyright (C) 1990, 1997, 1998, 1999, 2000, 2001 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
@ -66,17 +66,41 @@ char *realloc ();
char *stpcpy ();
#endif
#include <fnmatch.h>
#include "savedir.h"
char *path;
size_t pathlen;
static int
isdir1 (const char *dir, const char *file)
{
int status;
int slash;
size_t dirlen = strlen (dir);
size_t filelen = strlen (file);
if ((dirlen + filelen + 2) > pathlen)
{
/* Release the previous, smaller buffer before replacing it. */
free (path);
path = calloc (dirlen + 1 + filelen + 1, sizeof (*path));
pathlen = dirlen + filelen + 2;
}
strcpy (path, dir);
slash = (path[dirlen] != '/');
path[dirlen] = '/';
strcpy (path + dirlen + slash , file);
status = isdir (path);
return status;
}
/* Return a freshly allocated string containing the filenames
in directory DIR, separated by '\0' characters;
the end is marked by two '\0' characters in a row.
NAME_SIZE is the number of bytes to initially allocate
for the string; it will be enlarged as needed.
Return NULL if DIR cannot be opened or if out of memory. */
char *
savedir (const char *dir, off_t name_size)
savedir (const char *dir, off_t name_size, struct exclude *included_patterns,
struct exclude *excluded_patterns)
{
DIR *dirp;
struct dirent *dp;
@ -109,6 +133,17 @@ savedir (const char *dir, off_t name_size)
{
off_t size_needed = (namep - name_space) + NAMLEN (dp) + 2;
if ((included_patterns || excluded_patterns)
&& !isdir1 (dir, dp->d_name))
{
if (included_patterns
&& !excluded_filename (included_patterns, dp->d_name, 0))
continue;
if (excluded_patterns
&& excluded_filename (excluded_patterns, dp->d_name, 0))
continue;
}
if (size_needed > name_size)
{
char *new_name_space;
@ -134,5 +169,11 @@ savedir (const char *dir, off_t name_size)
free (name_space);
return NULL;
}
if (path)
{
free (path);
path = NULL;
pathlen = 0;
}
return name_space;
}
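
The string returned by savedir packs the file names back to back, each ended by '\0', with an extra '\0' marking the end of the list. A minimal sketch of how a caller might walk that layout follows; sketch_count_names and sketch_nth_name are hypothetical helpers, not part of savedir.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the names in LIST, a sequence of null-terminated names that
   ends with an empty name (two '\0' bytes in a row).  */
static size_t
sketch_count_names (char const *list)
{
  size_t n = 0;
  char const *p;

  assert (list != NULL);
  /* Each step skips one name plus its terminating null.  */
  for (p = list; *p != '\0'; p += strlen (p) + 1)
    n++;
  return n;
}

/* Return the name at position INDEX (counting from 0), or NULL if
   LIST holds fewer names than that.  */
static char const *
sketch_nth_name (char const *list, size_t index)
{
  char const *p;

  assert (list != NULL);
  for (p = list; *p != '\0'; p += strlen (p) + 1)
    if (index-- == 0)
      return p;
  return NULL;
}
```

Because an empty name ends the list, the format cannot represent an empty file name, which is not a restriction in practice since directory entries always have at least one character.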


@ -1,6 +1,8 @@
#if !defined SAVEDIR_H_
# define SAVEDIR_H_
#include "exclude.h"
# ifndef PARAMS
# if defined PROTOTYPES || (defined __STDC__ && __STDC__)
# define PARAMS(Args) Args
@ -9,7 +11,8 @@
# endif
# endif
char *
savedir PARAMS ((const char *dir, off_t name_size));
extern char *
savedir PARAMS ((const char *dir, off_t name_size,
struct exclude *, struct exclude *));
#endif


@ -22,54 +22,71 @@
# include <config.h>
#endif
#include <sys/types.h>
#if defined HAVE_WCTYPE_H && defined HAVE_WCHAR_H && defined HAVE_MBRTOWC
/* We can handle multibyte strings. */
# define MBS_SUPPORT
# include <wchar.h>
# include <wctype.h>
#endif
#include "system.h"
#include "grep.h"
#include "regex.h"
#include "dfa.h"
#include "kwset.h"
#include "error.h"
#include "xalloc.h"
#ifdef HAVE_LIBPCRE
# include <pcre.h>
#endif
#define NCHAR (UCHAR_MAX + 1)
static void Gcompile PARAMS((char *, size_t));
static void Ecompile PARAMS((char *, size_t));
static char *EGexecute PARAMS((char *, size_t, char **));
static void Fcompile PARAMS((char *, size_t));
static char *Fexecute PARAMS((char *, size_t, char **));
static void kwsinit PARAMS((void));
/* Here is the matchers vector for the main program. */
struct matcher matchers[] = {
{ "default", Gcompile, EGexecute },
{ "grep", Gcompile, EGexecute },
{ "egrep", Ecompile, EGexecute },
{ "awk", Ecompile, EGexecute },
{ "fgrep", Fcompile, Fexecute },
{ 0, 0, 0 },
};
/* For -w, we also consider _ to be word constituent. */
#define WCHAR(C) (ISALNUM(C) || (C) == '_')
/* DFA compiled regexp. */
static struct dfa dfa;
/* Regex compiled regexp. */
static struct re_pattern_buffer regexbuf;
/* The Regex compiled patterns. */
static struct patterns
{
/* Regex compiled regexp. */
struct re_pattern_buffer regexbuf;
struct re_registers regs; /* This is here on account of a BRAIN-DEAD
Q@#%!# library interface in regex.c. */
} patterns0;
struct patterns *patterns;
size_t pcount;
/* KWset compiled pattern. For Ecompile and Gcompile, we compile
a list of strings, at least one of which is known to occur in
any string matching the regexp. */
static kwset_t kwset;
/* Last compiled fixed string known to exactly match the regexp.
If kwsexec() returns < lastexact, then we don't need to
/* Number of compiled fixed strings known to exactly match the regexp.
If kwsexec returns < kwset_exact_matches, then we don't need to
call the regexp matcher at all. */
static int lastexact;
static int kwset_exact_matches;
#if defined(MBS_SUPPORT)
static char* check_multibyte_string PARAMS ((char const *buf, size_t size));
#endif
static void kwsinit PARAMS ((void));
static void kwsmusts PARAMS ((void));
static void Gcompile PARAMS ((char const *, size_t));
static void Ecompile PARAMS ((char const *, size_t));
static size_t EGexecute PARAMS ((char const *, size_t, size_t *, int ));
static void Fcompile PARAMS ((char const *, size_t));
static size_t Fexecute PARAMS ((char const *, size_t, size_t *, int));
static void Pcompile PARAMS ((char const *, size_t ));
static size_t Pexecute PARAMS ((char const *, size_t, size_t *, int));
void
dfaerror (char const *mesg)
{
fatal(mesg, 0);
error (2, 0, mesg);
}
static void
@ -80,10 +97,10 @@ kwsinit (void)
if (match_icase)
for (i = 0; i < NCHAR; ++i)
trans[i] = TOLOWER(i);
trans[i] = TOLOWER (i);
if (!(kwset = kwsalloc(match_icase ? trans : (char *) 0)))
fatal("memory exhausted", 0);
if (!(kwset = kwsalloc (match_icase ? trans : (char *) 0)))
error (2, 0, _("memory exhausted"));
}
/* If the DFA turns out to have some set of fixed strings one of
@ -93,12 +110,12 @@ kwsinit (void)
static void
kwsmusts (void)
{
struct dfamust *dm;
char *err;
struct dfamust const *dm;
char const *err;
if (dfa.musts)
{
kwsinit();
kwsinit ();
/* First, we compile in the substrings known to be exact
matches. The kwset matcher will return the index
of the matching string that it chooses. */
@ -106,9 +123,9 @@ kwsmusts (void)
{
if (!dm->exact)
continue;
++lastexact;
if ((err = kwsincr(kwset, dm->must, strlen(dm->must))) != 0)
fatal(err, 0);
++kwset_exact_matches;
if ((err = kwsincr (kwset, dm->must, strlen (dm->must))) != 0)
error (2, 0, err);
}
/* Now, we compile the substrings that will require
the use of the regexp matcher. */
@ -116,24 +133,90 @@ kwsmusts (void)
{
if (dm->exact)
continue;
if ((err = kwsincr(kwset, dm->must, strlen(dm->must))) != 0)
fatal(err, 0);
if ((err = kwsincr (kwset, dm->must, strlen (dm->must))) != 0)
error (2, 0, err);
}
if ((err = kwsprep(kwset)) != 0)
fatal(err, 0);
if ((err = kwsprep (kwset)) != 0)
error (2, 0, err);
}
}
#ifdef MBS_SUPPORT
/* Allocate an array with one entry per byte of BUF, then scan BUF as a
multibyte string. The entry for the first byte of each character holds
that character's length in bytes; entries for bytes that are neither a
single-byte character nor the first byte of a multibyte character are 0.
The caller must free the array. */
static char*
check_multibyte_string(char const *buf, size_t size)
{
char *mb_properties = malloc(size);
mbstate_t cur_state;
int i;
memset(&cur_state, 0, sizeof(mbstate_t));
memset(mb_properties, 0, sizeof(char)*size);
for (i = 0; i < size ;)
{
size_t mbclen;
mbclen = mbrlen(buf + i, size - i, &cur_state);
if (mbclen == (size_t) -1 || mbclen == (size_t) -2 || mbclen == 0)
{
/* An invalid sequence, or a truncated multibyte character.
We treat it as a singlebyte character. */
mbclen = 1;
}
mb_properties[i] = mbclen;
i += mbclen;
}
return mb_properties;
}
#endif
static void
Gcompile (char *pattern, size_t size)
Gcompile (char const *pattern, size_t size)
{
const char *err;
char const *sep;
size_t total = size;
char const *motif = pattern;
re_set_syntax(RE_SYNTAX_GREP | RE_HAT_LISTS_NOT_NEWLINE);
dfasyntax(RE_SYNTAX_GREP | RE_HAT_LISTS_NOT_NEWLINE, match_icase, eolbyte);
re_set_syntax (RE_SYNTAX_GREP | RE_HAT_LISTS_NOT_NEWLINE);
dfasyntax (RE_SYNTAX_GREP | RE_HAT_LISTS_NOT_NEWLINE, match_icase, eolbyte);
if ((err = re_compile_pattern(pattern, size, &regexbuf)) != 0)
fatal(err, 0);
/* The GNU regex compiler must be given the patterns one at a time so that
it detects errors such as "[\nallo\n]\n". The patterns here are "[", "allo"
and "]", for which GNU regex should raise a syntax error. The same applies
to back-references, which should be local to each pattern. */
do
{
size_t len;
sep = memchr (motif, '\n', total);
if (sep)
{
len = sep - motif;
sep++;
total -= (len + 1);
}
else
{
len = total;
total = 0;
}
patterns = realloc (patterns, (pcount + 1) * sizeof (*patterns));
if (patterns == NULL)
error (2, errno, _("memory exhausted"));
patterns[pcount] = patterns0;
if ((err = re_compile_pattern (motif, len,
&(patterns[pcount].regexbuf))) != 0)
error (2, 0, err);
pcount++;
motif = sep;
} while (sep && total != 0);
/* In the match_words and match_lines cases, we use a different pattern
for the DFA matcher that will quickly throw out cases that won't work.
@ -142,49 +225,42 @@ Gcompile (char *pattern, size_t size)
if (match_words || match_lines)
{
/* In the whole-word case, we use the pattern:
(^|[^A-Za-z_])(userpattern)([^A-Za-z_]|$).
\(^\|[^[:alnum:]_]\)\(userpattern\)\([^[:alnum:]_]\|$\).
In the whole-line case, we use the pattern:
^(userpattern)$.
BUG: Using [A-Za-z_] is locale-dependent!
So will use [:alnum:] */
^\(userpattern\)$. */
char *n = malloc(size + 50);
int i = 0;
strcpy(n, "");
if (match_lines)
strcpy(n, "^\\(");
if (match_words)
strcpy(n, "\\(^\\|[^[:alnum:]_]\\)\\(");
i = strlen(n);
memcpy(n + i, pattern, size);
static char const line_beg[] = "^\\(";
static char const line_end[] = "\\)$";
static char const word_beg[] = "\\(^\\|[^[:alnum:]_]\\)\\(";
static char const word_end[] = "\\)\\([^[:alnum:]_]\\|$\\)";
char *n = malloc (sizeof word_beg - 1 + size + sizeof word_end);
size_t i;
strcpy (n, match_lines ? line_beg : word_beg);
i = strlen (n);
memcpy (n + i, pattern, size);
i += size;
if (match_words)
strcpy(n + i, "\\)\\([^[:alnum:]_]\\|$\\)");
if (match_lines)
strcpy(n + i, "\\)$");
i += strlen(n + i);
dfacomp(n, i, &dfa, 1);
strcpy (n + i, match_lines ? line_end : word_end);
i += strlen (n + i);
pattern = n;
size = i;
}
else
dfacomp(pattern, size, &dfa, 1);
kwsmusts();
dfacomp (pattern, size, &dfa, 1);
kwsmusts ();
}
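
The do/while loop in Gcompile that cuts the pattern buffer at each newline can be read in isolation. The sketch below reproduces only the splitting arithmetic and records the length of each piece instead of compiling it; sketch_split_patterns and its LENS and MAX_LENS parameters are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split PATTERN (SIZE bytes, not necessarily null-terminated) at each
   newline, the way the Gcompile loop does.  Store the length of each
   piece in LENS, up to MAX_LENS entries, and return the number of
   pieces found.  A trailing newline does not start an extra piece.  */
static size_t
sketch_split_patterns (char const *pattern, size_t size,
                       size_t *lens, size_t max_lens)
{
  char const *motif = pattern;
  char const *sep;
  size_t total = size;
  size_t count = 0;

  assert (pattern != NULL);
  do
    {
      size_t len;
      sep = memchr (motif, '\n', total);
      if (sep)
        {
          /* Piece ends just before the newline; skip the newline.  */
          len = sep - motif;
          sep++;
          total -= len + 1;
        }
      else
        {
          /* Last piece: everything that remains.  */
          len = total;
          total = 0;
        }
      if (lens && count < max_lens)
        lens[count] = len;
      count++;
      motif = sep;
    }
  while (sep && total != 0);
  return count;
}
```

For the input "[\nallo\n]\n" from the comment in Gcompile this yields three pieces of lengths 1, 4 and 1. An empty input still yields one piece of length 0, because the loop body runs before the condition is tested.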
static void
Ecompile (char *pattern, size_t size)
Ecompile (char const *pattern, size_t size)
{
const char *err;
const char *sep;
size_t total = size;
char const *motif = pattern;
if (strcmp(matcher, "awk") == 0)
if (strcmp (matcher, "awk") == 0)
{
re_set_syntax(RE_SYNTAX_AWK);
dfasyntax(RE_SYNTAX_AWK, match_icase, eolbyte);
re_set_syntax (RE_SYNTAX_AWK);
dfasyntax (RE_SYNTAX_AWK, match_icase, eolbyte);
}
else
{
@ -192,8 +268,38 @@ Ecompile (char *pattern, size_t size)
dfasyntax (RE_SYNTAX_POSIX_EGREP, match_icase, eolbyte);
}
if ((err = re_compile_pattern(pattern, size, &regexbuf)) != 0)
fatal(err, 0);
/* The GNU regex compiler must be given the patterns one at a time so that
it detects errors such as "[\nallo\n]\n". The patterns here are "[", "allo"
and "]", for which GNU regex should raise a syntax error. The same applies
to back-references, which should be local to each pattern. */
do
{
size_t len;
sep = memchr (motif, '\n', total);
if (sep)
{
len = sep - motif;
sep++;
total -= (len + 1);
}
else
{
len = total;
total = 0;
}
patterns = realloc (patterns, (pcount + 1) * sizeof (*patterns));
if (patterns == NULL)
error (2, errno, _("memory exhausted"));
patterns[pcount] = patterns0;
if ((err = re_compile_pattern (motif, len,
&(patterns[pcount].regexbuf))) != 0)
error (2, 0, err);
pcount++;
motif = sep;
} while (sep && total != 0);
/* In the match_words and match_lines cases, we use a different pattern
for the DFA matcher that will quickly throw out cases that won't work.
@ -202,186 +308,236 @@ Ecompile (char *pattern, size_t size)
if (match_words || match_lines)
{
/* In the whole-word case, we use the pattern:
(^|[^A-Za-z_])(userpattern)([^A-Za-z_]|$).
(^|[^[:alnum:]_])(userpattern)([^[:alnum:]_]|$).
In the whole-line case, we use the pattern:
^(userpattern)$.
BUG: Using [A-Za-z_] is locale-dependent!
so will use the char class */
char *n = malloc(size + 50);
int i = 0;
strcpy(n, "");
if (match_lines)
strcpy(n, "^(");
if (match_words)
strcpy(n, "(^|[^[:alnum:]_])(");
^(userpattern)$. */
static char const line_beg[] = "^(";
static char const line_end[] = ")$";
static char const word_beg[] = "(^|[^[:alnum:]_])(";
static char const word_end[] = ")([^[:alnum:]_]|$)";
char *n = malloc (sizeof word_beg - 1 + size + sizeof word_end);
size_t i;
strcpy (n, match_lines ? line_beg : word_beg);
i = strlen(n);
memcpy(n + i, pattern, size);
memcpy (n + i, pattern, size);
i += size;
if (match_words)
strcpy(n + i, ")([^[:alnum:]_]|$)");
if (match_lines)
strcpy(n + i, ")$");
i += strlen(n + i);
dfacomp(n, i, &dfa, 1);
strcpy (n + i, match_lines ? line_end : word_end);
i += strlen (n + i);
pattern = n;
size = i;
}
else
dfacomp(pattern, size, &dfa, 1);
kwsmusts();
dfacomp (pattern, size, &dfa, 1);
kwsmusts ();
}
static char *
EGexecute (char *buf, size_t size, char **endp)
static size_t
EGexecute (char const *buf, size_t size, size_t *match_size, int exact)
{
register char *buflim, *beg, *end, save;
register char const *buflim, *beg, *end;
char eol = eolbyte;
int backref, start, len;
struct kwsmatch kwsm;
static struct re_registers regs; /* This is static on account of a BRAIN-DEAD
Q@#%!# library interface in regex.c. */
size_t i;
#ifdef MBS_SUPPORT
char *mb_properties = NULL;
#endif /* MBS_SUPPORT */
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1 && kwset)
mb_properties = check_multibyte_string(buf, size);
#endif /* MBS_SUPPORT */
buflim = buf + size;
for (beg = end = buf; end < buflim; beg = end + 1)
for (beg = end = buf; end < buflim; beg = end)
{
if (kwset)
if (!exact)
{
/* Find a possible match using the KWset matcher. */
beg = kwsexec(kwset, beg, buflim - beg, &kwsm);
if (!beg)
goto failure;
/* Narrow down to the line containing the candidate, and
run it through DFA. */
end = memchr(beg, eol, buflim - beg);
if (!end)
end = buflim;
while (beg > buf && beg[-1] != eol)
--beg;
save = *end;
if (kwsm.index < lastexact)
goto success;
if (!dfaexec(&dfa, beg, end, 0, (int *) 0, &backref))
if (kwset)
{
*end = save;
continue;
/* Find a possible match using the KWset matcher. */
size_t offset = kwsexec (kwset, beg, buflim - beg, &kwsm);
if (offset == (size_t) -1)
{
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1)
free(mb_properties);
#endif
return (size_t)-1;
}
beg += offset;
/* Narrow down to the line containing the candidate, and
run it through DFA. */
end = memchr(beg, eol, buflim - beg);
end++;
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1 && mb_properties[beg - buf] == 0)
continue;
#endif
while (beg > buf && beg[-1] != eol)
--beg;
if (kwsm.index < kwset_exact_matches)
goto success;
if (dfaexec (&dfa, beg, end - beg, &backref) == (size_t) -1)
continue;
}
else
{
/* No good fixed strings; start with DFA. */
size_t offset = dfaexec (&dfa, beg, buflim - beg, &backref);
if (offset == (size_t) -1)
break;
/* Narrow down to the line we've found. */
beg += offset;
end = memchr (beg, eol, buflim - beg);
end++;
while (beg > buf && beg[-1] != eol)
--beg;
}
*end = save;
/* Successful, no backreferences encountered. */
if (!backref)
goto success;
}
else
{
/* No good fixed strings; start with DFA. */
save = *buflim;
beg = dfaexec(&dfa, beg, buflim, 0, (int *) 0, &backref);
*buflim = save;
if (!beg)
goto failure;
/* Narrow down to the line we've found. */
end = memchr(beg, eol, buflim - beg);
if (!end)
end = buflim;
while (beg > buf && beg[-1] != eol)
--beg;
/* Successful, no backreferences encountered! */
if (!backref)
goto success;
}
else
end = beg + size;
/* If we've made it to this point, this means DFA has seen
a probable match, and we need to run it through Regex. */
regexbuf.not_eol = 0;
if ((start = re_search(&regexbuf, beg, end - beg, 0, end - beg, &regs)) >= 0)
for (i = 0; i < pcount; i++)
{
len = regs.end[0] - start;
if ((!match_lines && !match_words)
|| (match_lines && len == end - beg))
goto success;
/* If -w, check if the match aligns with word boundaries.
We do this iteratively because:
(a) the line may contain more than one occurrence of the pattern, and
(b) Several alternatives in the pattern might be valid at a given
point, and we may need to consider a shorter one to find a word
boundary. */
if (match_words)
while (start >= 0)
{
if ((start == 0 || !WCHAR ((unsigned char) beg[start - 1]))
&& (len == end - beg
|| !WCHAR ((unsigned char) beg[start + len])))
goto success;
if (len > 0)
patterns[i].regexbuf.not_eol = 0;
if (0 <= (start = re_search (&(patterns[i].regexbuf), beg,
end - beg - 1, 0,
end - beg - 1, &(patterns[i].regs))))
{
len = patterns[i].regs.end[0] - start;
if (exact)
{
*match_size = len;
return start;
}
if ((!match_lines && !match_words)
|| (match_lines && len == end - beg - 1))
goto success;
/* If -w, check if the match aligns with word boundaries.
We do this iteratively because:
(a) the line may contain more than one occurrence of the
pattern, and
(b) Several alternatives in the pattern might be valid at a
given point, and we may need to consider a shorter one to
find a word boundary. */
if (match_words)
while (start >= 0)
{
/* Try a shorter length anchored at the same place. */
--len;
regexbuf.not_eol = 1;
len = re_match(&regexbuf, beg, start + len, start, &regs);
if ((start == 0 || !WCHAR ((unsigned char) beg[start - 1]))
&& (len == end - beg - 1
|| !WCHAR ((unsigned char) beg[start + len])))
goto success;
if (len > 0)
{
/* Try a shorter length anchored at the same place. */
--len;
patterns[i].regexbuf.not_eol = 1;
len = re_match (&(patterns[i].regexbuf), beg,
start + len, start,
&(patterns[i].regs));
}
if (len <= 0)
{
/* Try looking further on. */
if (start == end - beg - 1)
break;
++start;
patterns[i].regexbuf.not_eol = 0;
start = re_search (&(patterns[i].regexbuf), beg,
end - beg - 1,
start, end - beg - 1 - start,
&(patterns[i].regs));
len = patterns[i].regs.end[0] - start;
}
}
if (len <= 0)
{
/* Try looking further on. */
if (start == end - beg)
break;
++start;
regexbuf.not_eol = 0;
start = re_search(&regexbuf, beg, end - beg,
start, end - beg - start, &regs);
len = regs.end[0] - start;
}
}
}
}
failure:
return 0;
}
} /* for Regex patterns. */
} /* for (beg = end ..) */
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1 && mb_properties)
free (mb_properties);
#endif /* MBS_SUPPORT */
return (size_t) -1;
success:
*endp = end < buflim ? end + 1 : end;
return beg;
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1 && mb_properties)
free (mb_properties);
#endif /* MBS_SUPPORT */
*match_size = end - beg;
return beg - buf;
}
static void
Fcompile (char *pattern, size_t size)
Fcompile (char const *pattern, size_t size)
{
char *beg, *lim, *err;
char const *beg, *lim, *err;
kwsinit();
kwsinit ();
beg = pattern;
do
{
for (lim = beg; lim < pattern + size && *lim != '\n'; ++lim)
;
if ((err = kwsincr(kwset, beg, lim - beg)) != 0)
fatal(err, 0);
if ((err = kwsincr (kwset, beg, lim - beg)) != 0)
error (2, 0, err);
if (lim < pattern + size)
++lim;
beg = lim;
}
while (beg < pattern + size);
if ((err = kwsprep(kwset)) != 0)
fatal(err, 0);
if ((err = kwsprep (kwset)) != 0)
error (2, 0, err);
}
static char *
Fexecute (char *buf, size_t size, char **endp)
static size_t
Fexecute (char const *buf, size_t size, size_t *match_size, int exact)
{
register char *beg, *try, *end;
register char const *beg, *try, *end;
register size_t len;
char eol = eolbyte;
struct kwsmatch kwsmatch;
#ifdef MBS_SUPPORT
char *mb_properties;
if (MB_CUR_MAX > 1)
mb_properties = check_multibyte_string (buf, size);
#endif /* MBS_SUPPORT */
for (beg = buf; beg <= buf + size; ++beg)
{
if (!(beg = kwsexec(kwset, beg, buf + size - beg, &kwsmatch)))
return 0;
size_t offset = kwsexec (kwset, beg, buf + size - beg, &kwsmatch);
if (offset == (size_t) -1)
{
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1)
free(mb_properties);
#endif /* MBS_SUPPORT */
return offset;
}
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1 && mb_properties[offset+beg-buf] == 0)
continue; /* It is a part of multibyte character. */
#endif /* MBS_SUPPORT */
beg += offset;
len = kwsmatch.size[0];
if (exact)
{
*match_size = len;
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1)
free (mb_properties);
#endif /* MBS_SUPPORT */
return beg - buf;
}
if (match_lines)
{
if (beg > buf && beg[-1] != eol)
@ -391,13 +547,22 @@ Fexecute (char *buf, size_t size, char **endp)
goto success;
}
else if (match_words)
for (try = beg; len && try;)
for (try = beg; len; )
{
if (try > buf && WCHAR((unsigned char) try[-1]))
break;
if (try + len < buf + size && WCHAR((unsigned char) try[len]))
{
try = kwsexec(kwset, beg, --len, &kwsmatch);
offset = kwsexec (kwset, beg, --len, &kwsmatch);
if (offset == (size_t) -1)
{
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1)
free (mb_properties);
#endif /* MBS_SUPPORT */
return offset;
}
try = beg + offset;
len = kwsmatch.size[0];
}
else
@ -407,15 +572,153 @@ Fexecute (char *buf, size_t size, char **endp)
goto success;
}
return 0;
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1)
free (mb_properties);
#endif /* MBS_SUPPORT */
return -1;
success:
if ((end = memchr(beg + len, eol, (buf + size) - (beg + len))) != 0)
++end;
else
end = buf + size;
*endp = end;
while (beg > buf && beg[-1] != '\n')
end = memchr (beg + len, eol, (buf + size) - (beg + len));
end++;
while (buf < beg && beg[-1] != eol)
--beg;
return beg;
*match_size = end - beg;
#ifdef MBS_SUPPORT
if (MB_CUR_MAX > 1)
free (mb_properties);
#endif /* MBS_SUPPORT */
return beg - buf;
}
#if HAVE_LIBPCRE
/* Compiled internal form of a Perl regular expression. */
static pcre *cre;
/* Additional information about the pattern. */
static pcre_extra *extra;
#endif
static void
Pcompile (char const *pattern, size_t size)
{
#if !HAVE_LIBPCRE
error (2, 0, _("The -P option is not supported"));
#else
int e;
char const *ep;
char *re = xmalloc (4 * size + 7);
int flags = PCRE_MULTILINE | (match_icase ? PCRE_CASELESS : 0);
char const *patlim = pattern + size;
char *n = re;
char const *p;
char const *pnul;
/* FIXME: Remove this restriction. */
if (eolbyte != '\n')
error (2, 0, _("The -P and -z options cannot be combined"));
*n = '\0';
if (match_lines)
strcpy (n, "^(");
if (match_words)
strcpy (n, "\\b(");
n += strlen (n);
/* The PCRE interface doesn't allow NUL bytes in the pattern, so
replace each NUL byte in the pattern with the four characters
"\000", removing a preceding backslash if there are an odd
number of backslashes before the NUL.
FIXME: This method does not work with some multibyte character
encodings, notably Shift-JIS, where a multibyte character can end
in a backslash byte. */
for (p = pattern; (pnul = memchr (p, '\0', patlim - p)); p = pnul + 1)
{
memcpy (n, p, pnul - p);
n += pnul - p;
for (p = pnul; pattern < p && p[-1] == '\\'; p--)
continue;
n -= (pnul - p) & 1;
strcpy (n, "\\000");
n += 4;
}
memcpy (n, p, patlim - p);
n += patlim - p;
*n = '\0';
if (match_words)
strcpy (n, ")\\b");
if (match_lines)
strcpy (n, ")$");
cre = pcre_compile (re, flags, &ep, &e, pcre_maketables ());
if (!cre)
error (2, 0, ep);
extra = pcre_study (cre, 0, &ep);
if (ep)
error (2, 0, ep);
free (re);
#endif
}
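The NUL-escaping loop in Pcompile can be tried in isolation. The following is a minimal sketch only: `escape_nuls` is a hypothetical helper that does not exist in grep, and it leaves out the `^(`, `\b(` wrappers and the PCRE calls. It copies a pattern that may contain NUL bytes into a NUL-terminated buffer, writing each NUL as the four characters `\000` and dropping one preceding backslash when an odd number of backslashes precedes the NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of grep.  Copy PATTERN (SIZE bytes, may
   contain NUL bytes) into OUT as a NUL-terminated string, replacing each
   NUL byte with the four characters "\000".  When an odd number of
   backslashes precedes the NUL, the last one is dropped so that it does
   not escape the backslash being inserted.  OUT must have room for
   4 * SIZE + 1 bytes.  Returns the length of the result.  */
size_t
escape_nuls (char const *pattern, size_t size, char *out)
{
  char const *patlim = pattern + size;
  char const *p;
  char const *pnul;
  char *n = out;

  assert (pattern != NULL && out != NULL);

  for (p = pattern; (pnul = memchr (p, '\0', patlim - p)); p = pnul + 1)
    {
      /* Copy everything up to, but not including, the NUL.  */
      memcpy (n, p, pnul - p);
      n += pnul - p;
      /* Walk back over the backslashes immediately before the NUL.  */
      for (p = pnul; pattern < p && p[-1] == '\\'; p--)
        continue;
      /* Odd count: back up over the last copied backslash.  */
      n -= (pnul - p) & 1;
      memcpy (n, "\\000", 4);
      n += 4;
    }
  /* Copy the tail after the last NUL and terminate.  */
  memcpy (n, p, patlim - p);
  n += patlim - p;
  *n = '\0';
  return n - out;
}
```

For the three-byte pattern `a`, NUL, `b` this yields the six-character string `a\000b`. The four-byte pattern `a`, backslash, NUL, `b` yields the same string, because the lone backslash is removed. As the FIXME in the original comment notes, this approach is not safe for encodings in which a multibyte character can end in a backslash byte.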
static size_t
Pexecute (char const *buf, size_t size, size_t *match_size, int exact)
{
#if !HAVE_LIBPCRE
abort ();
return -1;
#else
/* This array must have at least two elements; everything after that
is just for performance improvement in pcre_exec. */
int sub[300];
int e = pcre_exec (cre, extra, buf, size, 0, 0,
sub, sizeof sub / sizeof *sub);
if (e <= 0)
{
switch (e)
{
case PCRE_ERROR_NOMATCH:
return -1;
case PCRE_ERROR_NOMEMORY:
error (2, 0, _("Memory exhausted"));
default:
abort ();
}
}
else
{
/* Narrow down to the line we've found. */
char const *beg = buf + sub[0];
char const *end = buf + sub[1];
char const *buflim = buf + size;
char eol = eolbyte;
if (!exact)
{
end = memchr (end, eol, buflim - end);
end++;
while (buf < beg && beg[-1] != eol)
--beg;
}
*match_size = end - beg;
return beg - buf;
}
#endif
}
struct matcher const matchers[] = {
{ "default", Gcompile, EGexecute },
{ "grep", Gcompile, EGexecute },
{ "egrep", Ecompile, EGexecute },
{ "awk", Ecompile, EGexecute },
{ "fgrep", Fcompile, Fexecute },
{ "perl", Pcompile, Pexecute },
{ "", 0, 0 },
};
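The table above ends with an entry whose name is empty and whose function pointers are null. A lookup over such a table might look like the sketch below. Everything here is illustrative: `matcher_entry`, `find_matcher`, the stub callbacks and `demo_matchers` are hypothetical names, and stopping at the first entry with a null `compile` pointer is an assumption about how the sentinel is meant to be read, not a copy of grep's own lookup code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for grep's matcher entry; the field types follow
   the compile/execute signatures that appear in the diff.  */
struct matcher_entry
{
  char const *name;
  void (*compile) (char const *pattern, size_t size);
  size_t (*execute) (char const *buf, size_t size,
                     size_t *match_size, int exact);
};

/* Stub callbacks so that the demo table below is complete.  */
void
demo_compile (char const *pattern, size_t size)
{
  (void) pattern;
  (void) size;
}

size_t
demo_execute (char const *buf, size_t size, size_t *match_size, int exact)
{
  (void) buf;
  (void) size;
  (void) match_size;
  (void) exact;
  return (size_t) -1;   /* "no match", as in the new execute functions */
}

/* Demo table with the same sentinel shape as matchers[] above.  */
struct matcher_entry const demo_matchers[] = {
  { "grep", demo_compile, demo_execute },
  { "fgrep", demo_compile, demo_execute },
  { "perl", demo_compile, demo_execute },
  { "", 0, 0 },
};

/* Return the entry in TABLE whose name equals NAME, or NULL if there is
   none.  TABLE must end with an entry whose compile pointer is null.  */
struct matcher_entry const *
find_matcher (struct matcher_entry const *table, char const *name)
{
  struct matcher_entry const *m;

  assert (table != NULL && name != NULL);

  for (m = table; m->compile; m++)
    if (strcmp (m->name, name) == 0)
      return m;
  return NULL;
}
```

With this shape, `find_matcher (demo_matchers, "fgrep")` returns a pointer to the second entry, while an unknown name, or the empty name of the sentinel itself, returns NULL.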

View File

@ -53,18 +53,16 @@ extern char *sys_errlist[];
#endif
/* Some operating systems treat text and binary files differently. */
#if O_BINARY
#ifdef __BEOS__
# undef O_BINARY /* BeOS 5 has O_BINARY and O_TEXT, but they have no effect. */
#endif
#ifdef HAVE_DOS_FILE_CONTENTS
# include <io.h>
# ifdef HAVE_SETMODE
# define SET_BINARY(fd) setmode (fd, O_BINARY)
# else
# define SET_BINARY(fd) _setmode (fd, O_BINARY)
# endif
#else
# ifndef O_BINARY
# define O_BINARY 0
# define SET_BINARY(fd) (void)0
# endif
#endif
#ifdef HAVE_DOS_FILE_NAMES
@ -80,14 +78,15 @@ extern char *sys_errlist[];
# define FILESYSTEM_PREFIX_LEN(f) 0
#endif
/* This assumes _WIN32, like DJGPP, has D_OK. Does it? In what header? */
#ifdef D_OK
int isdir PARAMS ((char const *));
#ifdef HAVE_DIR_EACCES_BUG
# ifdef EISDIR
# define is_EISDIR(e, f) \
((e) == EISDIR \
|| ((e) == EACCES && access (f, D_OK) == 0 && ((e) = EISDIR, 1)))
|| ((e) == EACCES && isdir (f) && ((e) = EISDIR, 1)))
# else
# define is_EISDIR(e, f) ((e) == EACCES && access (f, D_OK) == 0)
# define is_EISDIR(e, f) ((e) == EACCES && isdir (f))
# endif
#endif

View File

@ -0,0 +1,38 @@
#!/bin/sh
# Test that backrefs are local to regex.
#
#
: ${srcdir=.}
failures=0
# checking for a palindrome
echo "radar" | ${GREP} -e '\(.\)\(.\).\2\1' > /dev/null 2>&1
if test $? -ne 0 ; then
echo "backref: palindrome, test \#1 failed"
failures=1
fi
# hit hard with the `Bond' tests
echo "civic" | ${GREP} -E -e '^(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?)(.?).?\9\8\7\6\5\4\3\2\1$' > /dev/null 2>&1
if test $? -ne 0 ; then
echo "Options: Bond, test \#2 failed"
failures=1
fi
# backrefs are local to each pattern, so this should be an error
echo "123" | ${GREP} -e 'a\(.\)' -e 'b\1' > /dev/null 2>&1
if test $? -ne 2 ; then
echo "Options: Backref not local, test \#3 failed"
failures=1
fi
# Pattern should fail
echo "123" | ${GREP} -e '[' -e ']' > /dev/null 2>&1
if test $? -ne 2 ; then
echo "Options: Compiled not local, test \#4 failed"
failures=1
fi
exit $failures

View File

@ -8,7 +8,8 @@ BEGIN {
$0 ~ /^#/ { next; }
NF == 3 {
printf ("status=`echo '%s' | { ${GREP} -e '%s' > /dev/null 2>&1; echo $?; cat >/dev/null; }`\n",$3, $2);
# printf ("status=`echo '%s' | { ${GREP} -e '%s' > /dev/null 2>&1; echo $?; cat >/dev/null; }`\n",$3, $2);
printf ("status=`echo '%s' | { ${GREP} -e '%s' > /dev/null 2>&1; echo $? ; }`\n",$3, $2);
printf ("if test $status -ne %s ; then\n", $1);
printf ("\techo Spencer bre test \\#%d failed\n", ++n);
printf ("\tfailures=1\n");

View File

@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
# Regression test for GNU grep.
: ${srcdir=.}

View File

@ -17,7 +17,7 @@
2@\(\{1\}a\)@BADRPT@TO CORRECT
0@^*@*
2@^\{1\}@BADRPT@TO CORRECT
0@\{@{
0@{@{
1@a\(b*\)c\1d@abbcbd
1@a\(b*\)c\1d@abbcbbbd
1@^\(.\)\1@abc

View File

@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
# test that the empty file means no pattern
# and an empty pattern means match all.
@ -6,25 +6,28 @@
failures=0
# should return 0 found a match
echo "abcd" | ${GREP} -E -e '' > /dev/null 2>&1
if test $? -ne 0 ; then
echo "Status: Wrong status code, test \#1 failed"
failures=1
fi
for options in '-E' '-E -w' '-F -x' '-G -w -x'; do
# should return 1 found no match
echo "abcd" | ${GREP} -E -f /dev/null > /dev/null 2>&1
if test $? -ne 1 ; then
echo "Status: Wrong status code, test \#2 failed"
failures=1
fi
# should return 0 found a match
echo "" | ${GREP} $options -e '' > /dev/null 2>&1
if test $? -ne 0 ; then
echo "Status: Wrong status code, test \#1 failed ($options)"
failures=1
fi
# should return 0 found a match
echo "abcd" | ${GREP} -E -f /dev/null -e "abc" > /dev/null 2>&1
if test $? -ne 0 ; then
echo "Status: Wrong status code, test \#3 failed"
failures=1
fi
# should return 1 found no match
echo "abcd" | ${GREP} $options -f /dev/null > /dev/null 2>&1
if test $? -ne 1 ; then
echo "Status: Wrong status code, test \#2 failed ($options)"
failures=1
fi
# should return 0 found a match
echo "abcd" | ${GREP} $options -f /dev/null -e "abcd" > /dev/null 2>&1
if test $? -ne 0 ; then
echo "Status: Wrong status code, test \#3 failed ($options)"
failures=1
fi
done
exit $failures

View File

@ -8,7 +8,8 @@ BEGIN {
$0 ~ /^#/ { next; }
NF == 3 {
printf ("status=`echo '%s' | { ${GREP} -E -e '%s' > /dev/null 2>&1; echo $?; cat >/dev/null; }`\n",$3, $2);
# printf ("status=`echo '%s' | { ${GREP} -E -e '%s' > /dev/null 2>&1; echo $?; cat >/dev/null; }`\n",$3, $2);
printf ("status=`echo '%s' | { ${GREP} -E -e '%s' > /dev/null 2>&1; echo $?; }`\n",$3, $2);
printf ("if test $status -ne %s ; then\n", $1);
printf ("\techo Spencer ere test \\#%d failed\n", ++n);
printf ("\tfailures=1\n");

View File

@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
# Regression test for GNU grep.
: ${srcdir=.}

59
gnu/usr.bin/grep/tests/file.sh Executable file
View File

@ -0,0 +1,59 @@
#!/bin/sh
# Test for POSIX.2 options for grep
#
# grep -E -f pattern_file file
# grep -F -f pattern_file file
# grep -G -f pattern_file file
#
: ${srcdir=.}
failures=0
cat <<EOF >patfile
radar
MILES
GNU
EOF
# match
echo "miles" | ${GREP} -i -E -f patfile > /dev/null 2>&1
if test $? -ne 0 ; then
echo "File_pattern: Wrong status code, test \#1 failed"
failures=1
fi
# match
echo "GNU" | ${GREP} -G -f patfile > /dev/null 2>&1
if test $? -ne 0 ; then
echo "File_pattern: Wrong status code, test \#2 failed"
failures=1
fi
# checking for no match
echo "ridar" | ${GREP} -F -f patfile > /dev/null 2>&1
if test $? -ne 1 ; then
echo "File_pattern: Wrong status code, test \#3 failed"
failures=1
fi
cat <<EOF >patfile
EOF
# empty pattern : every match
echo "abbcd" | ${GREP} -F -f patfile > /dev/null 2>&1
if test $? -ne 0 ; then
echo "File_pattern: Wrong status code, test \#4 failed"
failures=1
fi
cp /dev/null patfile
# null pattern : no match
echo "abbcd" | ${GREP} -F -f patfile > /dev/null 2>&1
if test $? -ne 1 ; then
echo "File_pattern: Wrong status code, test \#5 failed"
failures=1
fi
exit $failures

View File

@ -0,0 +1,55 @@
#
# Basic Regular Expression
# skip comments
$0 ~ /^#/ { next; }
# skip those options specific to regexec/regcomp
$2 ~ /[msnr$#p^]/ { next; }
# skip empty lines
$0 ~ /^$/ { next; }
# debug
#{ printf ("<%s> <%s> <%s> <%s>\n", $1, $2, $3, $4); }
# subreg expression
NF >= 5 { next; }
# errors
NF == 3 {
# gsub (/@/, ",");
# it means empty lines
gsub (/\"\"/, "");
# escapes
gsub (/\\\'/, "\\\'\'");
# error in regex
if (index ($2, "C") != 0)
{
if (index ($2, "b") != 0)
printf ("2@%s@%s\n", $1, $3);
}
# error: no match
else
{
if (index ($2, "b") != 0)
printf ("1@%s@%s\n", $1, $3);
}
next;
}
# ok
NF == 4 {
# skip those magic cookies; we can't rely on echo to generate them
if (match($3, /[NSTZ]/))
next;
# gsub (/@/, ",");
# it means empty lines
gsub (/\"\"/, "");
# escape escapes
gsub (/\\\'/, "\\\'\'");
if (index ($2, "b") != 0)
printf ("0@%s@%s\n", $1, $3);
}

View File

@ -0,0 +1,60 @@
#
# Extended Regular Expression
# skip comments
$0 ~ /^#/ { next; }
# skip specifics to regcomp/regexec
$2 ~ /[msnr$#p^]/ { next; }
# jump empty lines
$0 ~ /^$/ { next; }
# subreg skip
NF >= 5 { next; }
# debug
#{ printf ("<%s> <%s> <%s> <%s>\n", $1, $2, $3, $4); }
# errors
NF == 3 {
# nuke any remaining '@'
# gsub (/@/, ",");
# it means empty lines
gsub (/\"\"/, "");
# escapes
gsub (/\\\'/, "\\\'\'");
# error in regex
if (index ($2, "C") != 0)
{
if (index ($2, "b") == 0)
printf ("2@%s@%s\n", $1, $3);
}
# error not matching
else
{
if (index ($2, "b") == 0)
printf ("1@%s@%s\n", $1, $3);
}
next;
}
# ok
NF == 4 {
# skip those magic cookies; we can't rely on echo to generate them
if (match($3, /[NSTZ]/))
next;
# nuke any remaining '@'
# gsub (/@/, ",");
# it means empty lines
gsub (/\"\"/, "");
# escape escapes
gsub (/\\\'/, "\\\'\'");
if (index ($2, "b") == 0)
{
printf ("0@%s@%s\n", $1, $3);
}
next;
}

View File

@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
# Regression test for GNU grep.
: ${srcdir=.}

View File

@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
# Test for POSIX.2 options for grep
#
# grep [ -E| -F][ -c| -l| -q ][-insvx] -e pattern_list

View File

@ -4,7 +4,8 @@ BEGIN {
}
$0 !~ /^#/ && NF == 3 {
printf ("status=`echo '%s'| { ${GREP} -E -e '%s' > /dev/null 2>&1; echo $?; cat >/dev/null; }`\n",$3, $2);
# printf ("status=`echo '%s'| { ${GREP} -E -e '%s' > /dev/null 2>&1; echo $?; cat >/dev/null; }`\n",$3, $2);
printf ("status=`echo '%s'| { ${GREP} -E -e '%s' >/dev/null 2>&1 ; echo $?; }`\n",$3, $2);
printf ("if test $status -ne %s ; then\n", $1);
printf ("\techo Spencer test \\#%d failed\n", ++n);
printf ("\tfailures=1\n");

View File

@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
# Regression test for GNU grep.
: ${srcdir=.}

View File

@ -0,0 +1,13 @@
#!/bin/sh
# Regression test for GNU grep.
: ${srcdir=.}
failures=0
# . . . and the following by Henry Spencer.
${AWK-awk} -f $srcdir/scriptgen.awk $srcdir/spencer2.tests > tmp2.script
sh tmp2.script && exit $failures
exit 1

View File

@ -0,0 +1,317 @@
0@a@a
0@abc@abc
0@abc|de@abc
0@a|b|c@abc
0@a(b)c@abc
1@a\(b\)c@abc
2@a(@EPAREN
2@a(@a(
0@a\(@a(
1@a\(@EPAREN
1@a\(b@EPAREN
2@a(b@EPAREN
2@a(b@a(b
2@a)@a)
2@)@)
2@a)@a)
1@a\)@EPAREN
1@\)@EPAREN
0@a()b@ab
1@a\(\)b@ab
0@^abc$@abc
1@a^b@a^b
1@a^b@a^b
1@a$b@a$b
1@a$b@a$b
0@^@abc
0@$@abc
1@^$@""
1@$^@""
1@\($\)\(^\)@""
0@^^@""
0@$$@""
1@b$@abNc
1@b$@abNc
1@^b$@aNbNc
1@^b$@aNbNc
1@^$@aNNb
1@^$@abc
1@^$@abcN
1@$^@aNNb
1@\($\)\(^\)@aNNb
0@^^@aNNb
0@$$@aNNb
0@^a@a
0@a$@a
0@^a@aNb
1@^b@aNb
0@a$@bNa
1@b$@bNa
0@a*(^b$)c*@b
1@a*\(^b$\)c*@b
0@|@EMPTY
0@|@|
0@*@BADRPT
0@*@*
0@+@BADRPT
0@?@BADRPT
1@""@EMPTY
0@()@abc
1@\(\)@abc
0@a||b@EMPTY
0@|ab@EMPTY
0@ab|@EMPTY
1@(|a)b@EMPTY
1@(a|)b@EMPTY
1@(*a)@BADRPT
1@(+a)@BADRPT
1@(?a)@BADRPT
1@({1}a)@BADRPT
1@\(\{1\}a\)@BADRPT
1@(a|*b)@BADRPT
1@(a|+b)@BADRPT
1@(a|?b)@BADRPT
1@(a|{1}b)@BADRPT
0@^*@BADRPT
0@^*@*
0@^+@BADRPT
0@^?@BADRPT
0@^{1}@BADRPT
1@^\{1\}@BADRPT
0@a.c@abc
0@a[bc]d@abd
0@a\*c@a*c
1@ac@abc
1@a\bc@ac
1@\{@BADRPT
0@a\[b@a[b
2@a[b@EBRACK
0@a$@a
1@a$@a$
1@a\$@a
0@a\$@a$
1@a\$@a
1@a\$@a\$
2@a\(b\)\2c@ESUBREG
2@a\(b\1\)c@ESUBREG
2@a\(b*\)c\1d@abbcbd
2@a\(b*\)c\1d@abbcbbbd
2@^\(.\)\1@abc
2@a\(\([bc]\)\2\)*d@abbccd
2@a\(\([bc]\)\2\)*d@abbcbd
2@a\(\(b\)*\2\)*d@abbbd
2@\(a\)\1bcd@aabcd
2@\(a\)\1bc*d@aabcd
2@\(a\)\1bc*d@aabd
2@\(a\)\1bc*d@aabcccd
2@\(a\)\1bc*[ce]d@aabcccd
2@^\(a\)\1b\(c\)*cd$@aabcccd
0@ab*c@abc
0@ab+c@abc
0@ab?c@abc
1@a\(*\)b@a*b
1@a\(**\)b@ab
1@a\(***\)b@BADRPT
0@*a@*a
0@**a@a
1@***a@BADRPT
2@{@{
2@{abc@{abc
2@{1@BADRPT
0@{1}@BADRPT
2@a{b@a{b
0@a{1}b@ab
1@a\{1\}b@ab
0@a{1,}b@ab
1@a\{1,\}b@ab
0@a{1,2}b@aab
1@a\{1,2\}b@aab
2@a{1@EBRACE
1@a\{1@EBRACE
2@a{1a@EBRACE
1@a\{1a@EBRACE
2@a{1a}@BADBR
1@a\{1a\}@BADBR
0@a{,2}@a{,2}
1@a\{,2\}@BADBR
0@a{,}@a{,}
1@a\{,\}@BADBR
2@a{1,x}@BADBR
1@a\{1,x\}@BADBR
2@a{1,x@EBRACE
1@a\{1,x@EBRACE
1@a{300}@BADBR
1@a\{300\}@BADBR
1@a{1,0}@BADBR
1@a\{1,0\}@BADBR
0@ab{0,0}c@abcac
1@ab\{0,0\}c@abcac
0@ab{0,1}c@abcac
1@ab\{0,1\}c@abcac
0@ab{0,3}c@abbcac
1@ab\{0,3\}c@abbcac
0@ab{1,1}c@acabc
1@ab\{1,1\}c@acabc
0@ab{1,3}c@acabc
1@ab\{1,3\}c@acabc
0@ab{2,2}c@abcabbc
1@ab\{2,2\}c@abcabbc
0@ab{2,4}c@abcabbc
1@ab\{2,4\}c@abcabbc
0@a**@BADRPT
1@a++@BADRPT
0@a??@BADRPT
0@a*+@BADRPT
0@a*?@BADRPT
0@a+*@BADRPT
0@a+?@BADRPT
0@a?*@BADRPT
0@a?+@BADRPT
1@a{1}{1}@BADRPT
0@a*{1}@BADRPT
1@a+{1}@BADRPT
0@a?{1}@BADRPT
0@a{1}*@BADRPT
1@a{1}+@BADRPT
0@a{1}?@BADRPT
2@a*{b}@a{b}
1@a\{1\}\{1\}@BADRPT
1@a*\{1\}@BADRPT
1@a\{1\}*@BADRPT
0@a[b]c@abc
0@a[ab]c@abc
0@a[^ab]c@adc
0@a[]b]c@a]c
0@a[[b]c@a[c
0@a[-b]c@a-c
0@a[^]b]c@adc
0@a[^-b]c@adc
0@a[b-]c@a-c
2@a[b@EBRACK
2@a[]@EBRACK
0@a[1-3]c@a2c
1@a[3-1]c@ERANGE
1@a[1-3-5]c@ERANGE
1@a[[.-.]--]c@a-c
2@a[1-@ERANGE
2@a[[.@EBRACK
2@a[[.x@EBRACK
2@a[[.x.@EBRACK
1@a[[.x.]@EBRACK
1@a[[.x.]]@ax
1@a[[.x,.]]@ECOLLATE
1@a[[.one.]]b@a1b
1@a[[.notdef.]]b@ECOLLATE
1@a[[.].]]b@a]b
0@a[[:alpha:]]c@abc
2@a[[:notdef:]]c@ECTYPE
2@a[[:@EBRACK
2@a[[:alpha@EBRACK
2@a[[:alpha:]@EBRACK
2@a[[:alpha,:]@ECTYPE
2@a[[:]:]]b@ECTYPE
2@a[[:-:]]b@ECTYPE
2@a[[:alph:]]@ECTYPE
2@a[[:alphabet:]]@ECTYPE
1@[[:blank:]]+@aSSTb
1@[[:cntrl:]]+@aNTb
0@[[:digit:]]+@a019b
0@[[:graph:]]+@Sa%bS
0@[[:lower:]]+@AabC
0@[[:print:]]+@NaSbN
0@[[:punct:]]+@S%-&T
1@[[:space:]]+@aSNTb
0@[[:upper:]]+@aBCd
0@[[:xdigit:]]+@p0f3Cq
1@a[[=b=]]c@abc
2@a[[=@EBRACK
2@a[[=b@EBRACK
2@a[[=b=@EBRACK
1@a[[=b=]@EBRACK
1@a[[=b,=]]@ECOLLATE
1@a[[=one=]]b@a1b
0@a(((b)))c@abc
0@a(b|(c))d@abd
0@a(b*|c)d@abbd
0@a[ab]{20}@aaaaabaaaabaaaabaaaab
0@a[ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab]@aaaaabaaaabaaaabaaaab
0@a[ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab](wee|week)(knights|night)@aaaaabaaaabaaaabaaaabweeknights
0@12345678901234567890123456789@a12345678901234567890123456789b
0@123456789012345678901234567890@a123456789012345678901234567890b
0@1234567890123456789012345678901@a1234567890123456789012345678901b
0@12345678901234567890123456789012@a12345678901234567890123456789012b
0@123456789012345678901234567890123@a123456789012345678901234567890123b
0@1234567890123456789012345678901234567890123456789012345678901234567890@a1234567890123456789012345678901234567890123456789012345678901234567890b
0@[ab][cd][ef][gh][ij][kl][mn]@xacegikmoq
0@[ab][cd][ef][gh][ij][kl][mn][op]@xacegikmoq
0@[ab][cd][ef][gh][ij][kl][mn][op][qr]@xacegikmoqy
0@[ab][cd][ef][gh][ij][kl][mn][op][q]@xacegikmoqy
0@abc@xabcy
2@a\(b\)?c\1d@acd
1@aBc@Abc
1@a[Bc]*d@abBCcd
1@0[[:upper:]]1@0a1
1@0[[:lower:]]1@0A1
1@a[^b]c@abc
0@a[^b]c@aBc
0@a[^b]c@adc
0@[a]b[c]@abc
0@[a]b[a]@aba
0@[abc]b[abc]@abc
0@[abc]b[abd]@abd
0@a(b?c)+d@accd
0@(wee|week)(knights|night)@weeknights
0@(we|wee|week|frob)(knights|night|day)@weeknights
0@a[bc]d@xyzaaabcaababdacd
0@a[ab]c@aaabc
0@abc@abc
0@a*@b
0@/\*.*\*/@/*x*/
0@/\*.*\*/@/*x*/y/*z*/
0@/\*([^*]|\*[^/])*\*/@/*x*/
0@/\*([^*]|\*[^/])*\*/@/*x*/y/*z*/
0@/\*([^*]|\*[^/])*\*/@/*x**/y/*z*/
0@/\*([^*]|\*+[^*/])*\*+/@/*x*/
0@/\*([^*]|\*+[^*/])*\*+/@/*x*/y/*z*/
0@/\*([^*]|\*+[^*/])*\*+/@/*x**/y/*z*/
0@/\*([^*]|\*+[^*/])*\*+/@/*x****/y/*z*/
0@/\*([^*]|\*+[^*/])*\*+/@/*x**x*/y/*z*/
0@/\*([^*]|\*+[^*/])*\*+/@/*x***x/y/*z*/
0@[abc]@a(b)c
0@[abc]@a(d)c
0@[abc]@a(bc)d
0@[abc]@a(dc)d
0@.@a()c
0@b.*c@b(bc)c
0@b.*@b(bc)c
0@.*c@b(bc)c
0@abc@abc
0@abc@xabcy
1@abc@xyz
0@a*b@aba*b
0@a*b@ab
1@""@EMPTY
1@aZb@a
1@aZb@a
0@aZb@(aZb)
0@aZ*b@(ab)
0@a.b@(aZb)
0@a.*@(aZb)c
2@[[:<:]]a@a
2@[[:<:]]a@ba
2@[[:<:]]a@-a
2@a[[:>:]]@a
2@a[[:>:]]@ab
2@a[[:>:]]@a-
2@[[:<:]]a.c[[:>:]]@axcd-dayc-dazce-abc
2@[[:<:]]a.c[[:>:]]@axcd-dayc-dazce-abc-q
2@[[:<:]]a.c[[:>:]]@axc-dayc-dazce-abc
2@[[:<:]]b.c[[:>:]]@a_bxc-byc_d-bzc-q
2@[[:<:]].x..[[:>:]]@y_xa_-_xb_y-_xc_-axdc
2@[[:<:]]a_b[[:>:]]@x_a_b
0@(A[1])|(A[2])|(A[3])|(A[4])|(A[5])|(A[6])|(A[7])|(A[8])|(A[9])|(A[A])@A1
0@abcdefghijklmnop@abcdefghijklmnop
0@abcdefghijklmnopqrstuv@abcdefghijklmnopqrstuv
0@CC[13]1|a{21}[23][EO][123][Es][12]a{15}aa[34][EW]aaaaaaa[X]a@CC11
0@a?b@ab
1@-\{0,1\}[0-9]*$@-5
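Each line of the table above appears to carry three `@`-separated fields: the expected grep exit status, the pattern, and the input line. That reading is based on the generator scripts elsewhere in this diff, which print `$3` as the echoed input, `$2` as the `-e` pattern and `$1` as the status to compare against. The sketch below splits one such line in C. `spencer_case` and `parse_case` are hypothetical names, and splitting at the first two `@` characters is an assumption; awk would split at every `@`, so the two differ on lines whose input itself contains `@`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one generated test case.  */
struct spencer_case
{
  int status;       /* expected grep exit status: 0, 1 or 2 */
  char *pattern;    /* points into the caller's line buffer */
  char *input;      /* points into the caller's line buffer */
};

/* Split LINE in place at its first two '@' characters and fill in TC.
   Returns 1 on success, 0 if LINE has fewer than three fields.  */
int
parse_case (char *line, struct spencer_case *tc)
{
  char *first;
  char *second;

  assert (line != NULL && tc != NULL);

  first = strchr (line, '@');
  if (!first)
    return 0;
  second = strchr (first + 1, '@');
  if (!second)
    return 0;

  /* Terminate the status and pattern fields where the separators were.  */
  *first = '\0';
  *second = '\0';

  tc->status = atoi (line);
  tc->pattern = first + 1;
  tc->input = second + 1;
  return 1;
}
```

Given the line `1@a\(b\)c@abc`, this fills in status 1, pattern `a\(b\)c` and input `abc`, which matches the idea that the ERE `a\(b\)c` should not match the string `abc`.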

View File

@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
# Test for status code for GNU grep.
# status code
# 0 match found
@ -24,15 +24,29 @@ if test $? -ne 1 ; then
fi
# the filename MMMMMMMM.MMM should not exist hopefully
# should return 2 file not found
if test -b MMMMMMMM.MMM; then
if test -r MMMMMMMM.MMM; then
echo "Please remove MMMMMMMM.MMM to run check"
else
${GREP} -E -e 'abc' MMMMMMMM.MMM> /dev/null 2>&1
# should return 2 file not found
${GREP} -E -e 'abc' MMMMMMMM.MMM > /dev/null 2>&1
if test $? -ne 2 ; then
echo "Status: Wrong status code, test \#3 failed"
failures=1
fi
# should return 2 file not found
${GREP} -E -s -e 'abc' MMMMMMMM.MMM > /dev/null 2>&1
if test $? -ne 2 ; then
echo "Status: Wrong status code, test \#4 failed"
failures=1
fi
# should return 0 found a match
echo "abcd" | ${GREP} -E -q -s 'abc' MMMMMMMM.MMM - > /dev/null 2>&1
if test $? -ne 0 ; then
echo "Status: Wrong status code, test \#5 failed"
failures=1
fi
fi
exit $failures

View File

@ -0,0 +1,475 @@
# regular expression test set
# Lines are at least three fields, separated by one or more tabs. "" stands
# for an empty field. First field is an RE. Second field is flags. If
# C flag given, regcomp() is expected to fail, and the third field is the
# error name (minus the leading REG_).
#
# Otherwise it is expected to succeed, and the third field is the string to
# try matching it against. If there is no fourth field, the match is
# expected to fail. If there is a fourth field, it is the substring that
# the RE is expected to match. If there is a fifth field, it is a comma-
# separated list of what the subexpressions should match, with - indicating
# no match for that one. In both the fourth and fifth fields, a (sub)field
# starting with @ indicates that the (sub)expression is expected to match
# a null string followed by the stuff after the @; this provides a way to
# test where null strings match. The character `N' in REs and strings
# is newline, `S' is space, `T' is tab, `Z' is NUL.
#
# The full list of flags:
# - placeholder, does nothing
# b RE is a BRE, not an ERE
# & try it as both an ERE and a BRE
# C regcomp() error expected, third field is error name
# i REG_ICASE
# m ("mundane") REG_NOSPEC
# s REG_NOSUB (not really testable)
# n REG_NEWLINE
# ^ REG_NOTBOL
# $ REG_NOTEOL
# # REG_STARTEND (see below)
# p REG_PEND
#
# For REG_STARTEND, the start/end offsets are those of the substring
# enclosed in ().
# basics
a & a a
abc & abc abc
abc|de - abc abc
a|b|c - abc a
# parentheses and perversions thereof
a(b)c - abc abc
a\(b\)c b abc abc
a( C EPAREN
a( b a( a(
a\( - a( a(
a\( bC EPAREN
a\(b bC EPAREN
a(b C EPAREN
a(b b a(b a(b
# gag me with a right parenthesis -- 1003.2 goofed here (my fault, partly)
a) - a) a)
) - ) )
# end gagging (in a just world, those *should* give EPAREN)
a) b a) a)
a\) bC EPAREN
\) bC EPAREN
a()b - ab ab
a\(\)b b ab ab
# anchoring and REG_NEWLINE
^abc$ & abc abc
a^b - a^b
a^b b a^b a^b
a$b - a$b
a$b b a$b a$b
^ & abc @abc
$ & abc @
^$ & "" @
$^ - "" @
\($\)\(^\) b "" @
# stop retching, those are legitimate (although disgusting)
^^ - "" @
$$ - "" @
##b$ & abNc
##b$ &n abNc b
##^b$ & aNbNc
##^b$ &n aNbNc b
##^$ &n aNNb @Nb
^$ n abc
##^$ n abcN @
##$^ n aNNb @Nb
##\($\)\(^\) bn aNNb @Nb
##^^ n^ aNNb @Nb
##$$ n aNNb @NN
^a ^ a
a$ $ a
##^a ^n aNb
##^b ^n aNb b
##a$ $n bNa
##b$ $n bNa b
a*(^b$)c* - b b
a*\(^b$\)c* b b b
# certain syntax errors and non-errors
| C EMPTY
| b | |
* C BADRPT
* b * *
+ C BADRPT
? C BADRPT
"" &C EMPTY
() - abc @abc
\(\) b abc @abc
a||b C EMPTY
|ab C EMPTY
ab| C EMPTY
(|a)b C EMPTY
(a|)b C EMPTY
(*a) C BADRPT
(+a) C BADRPT
(?a) C BADRPT
({1}a) C BADRPT
\(\{1\}a\) bC BADRPT
(a|*b) C BADRPT
(a|+b) C BADRPT
(a|?b) C BADRPT
(a|{1}b) C BADRPT
^* C BADRPT
^* b * *
^+ C BADRPT
^? C BADRPT
^{1} C BADRPT
^\{1\} bC BADRPT
# metacharacters, backslashes
a.c & abc abc
a[bc]d & abd abd
a\*c & a*c a*c
a\\b & a\b a\b
a\\\*b & a\*b a\*b
a\bc & abc abc
a\ &C EESCAPE
a\\bc & a\bc a\bc
\{ bC BADRPT
a\[b & a[b a[b
a[b &C EBRACK
# trailing $ is a peculiar special case for the BRE code
a$ & a a
a$ & a$
a\$ & a
a\$ & a$ a$
a\\$ & a
a\\$ & a$
a\\$ & a\$
a\\$ & a\ a\
# back references, ugh
##a\(b\)\2c bC ESUBREG
##a\(b\1\)c bC ESUBREG
a\(b*\)c\1d b abbcbbd abbcbbd bb
a\(b*\)c\1d b abbcbd
a\(b*\)c\1d b abbcbbbd
^\(.\)\1 b abc
a\([bc]\)\1d b abcdabbd abbd b
a\(\([bc]\)\2\)*d b abbccd abbccd
a\(\([bc]\)\2\)*d b abbcbd
# actually, this next one probably ought to fail, but the spec is unclear
a\(\(b\)*\2\)*d b abbbd abbbd
# here is a case that no NFA implementation does right
\(ab*\)[ab]*\1 b ababaaa ababaaa a
# check out normal matching in the presence of back refs
\(a\)\1bcd b aabcd aabcd
\(a\)\1bc*d b aabcd aabcd
\(a\)\1bc*d b aabd aabd
\(a\)\1bc*d b aabcccd aabcccd
\(a\)\1bc*[ce]d b aabcccd aabcccd
^\(a\)\1b\(c\)*cd$ b aabcccd aabcccd
# ordinary repetitions
ab*c & abc abc
ab+c - abc abc
ab?c - abc abc
a\(*\)b b a*b a*b
a\(**\)b b ab ab
a\(***\)b bC BADRPT
*a b *a *a
**a b a a
***a bC BADRPT
# the dreaded bounded repetitions
{ & { {
{abc & {abc {abc
{1 C BADRPT
{1} C BADRPT
a{b & a{b a{b
a{1}b - ab ab
a\{1\}b b ab ab
a{1,}b - ab ab
a\{1,\}b b ab ab
a{1,2}b - aab aab
a\{1,2\}b b aab aab
a{1 C EBRACE
a\{1 bC EBRACE
a{1a C EBRACE
a\{1a bC EBRACE
a{1a} C BADBR
a\{1a\} bC BADBR
a{,2} - a{,2} a{,2}
a\{,2\} bC BADBR
a{,} - a{,} a{,}
a\{,\} bC BADBR
a{1,x} C BADBR
a\{1,x\} bC BADBR
a{1,x C EBRACE
a\{1,x bC EBRACE
a{300} C BADBR
a\{300\} bC BADBR
a{1,0} C BADBR
a\{1,0\} bC BADBR
ab{0,0}c - abcac ac
ab\{0,0\}c b abcac ac
ab{0,1}c - abcac abc
ab\{0,1\}c b abcac abc
ab{0,3}c - abbcac abbc
ab\{0,3\}c b abbcac abbc
ab{1,1}c - acabc abc
ab\{1,1\}c b acabc abc
ab{1,3}c - acabc abc
ab\{1,3\}c b acabc abc
ab{2,2}c - abcabbc abbc
ab\{2,2\}c b abcabbc abbc
ab{2,4}c - abcabbc abbc
ab\{2,4\}c b abcabbc abbc
((a{1,10}){1,10}){1,10} - a a a,a
# multiple repetitions
a** &C BADRPT
a++ C BADRPT
a?? C BADRPT
a*+ C BADRPT
a*? C BADRPT
a+* C BADRPT
a+? C BADRPT
a?* C BADRPT
a?+ C BADRPT
a{1}{1} C BADRPT
a*{1} C BADRPT
a+{1} C BADRPT
a?{1} C BADRPT
a{1}* C BADRPT
a{1}+ C BADRPT
a{1}? C BADRPT
a*{b} - a{b} a{b}
a\{1\}\{1\} bC BADRPT
a*\{1\} bC BADRPT
a\{1\}* bC BADRPT
# brackets, and numerous perversions thereof
a[b]c & abc abc
a[ab]c & abc abc
a[^ab]c & adc adc
a[]b]c & a]c a]c
a[[b]c & a[c a[c
a[-b]c & a-c a-c
a[^]b]c & adc adc
a[^-b]c & adc adc
a[b-]c & a-c a-c
a[b &C EBRACK
a[] &C EBRACK
a[1-3]c & a2c a2c
a[3-1]c &C ERANGE
a[1-3-5]c &C ERANGE
a[[.-.]--]c & a-c a-c
a[1- &C ERANGE
a[[. &C EBRACK
a[[.x &C EBRACK
a[[.x. &C EBRACK
a[[.x.] &C EBRACK
a[[.x.]] & ax ax
a[[.x,.]] &C ECOLLATE
a[[.one.]]b & a1b a1b
a[[.notdef.]]b &C ECOLLATE
a[[.].]]b & a]b a]b
a[[:alpha:]]c & abc abc
a[[:notdef:]]c &C ECTYPE
a[[: &C EBRACK
a[[:alpha &C EBRACK
a[[:alpha:] &C EBRACK
a[[:alpha,:] &C ECTYPE
a[[:]:]]b &C ECTYPE
a[[:-:]]b &C ECTYPE
a[[:alph:]] &C ECTYPE
a[[:alphabet:]] &C ECTYPE
##[[:alnum:]]+ - -%@a0X- a0X
##[[:alpha:]]+ - -%@aX0- aX
[[:blank:]]+ - aSSTb SST
##[[:cntrl:]]+ - aNTb NT
[[:digit:]]+ - a019b 019
##[[:graph:]]+ - Sa%bS a%b
[[:lower:]]+ - AabC ab
##[[:print:]]+ - NaSbN aSb
##[[:punct:]]+ - S%-&T %-&
[[:space:]]+ - aSNTb SNT
[[:upper:]]+ - aBCd BC
[[:xdigit:]]+ - p0f3Cq 0f3C
a[[=b=]]c & abc abc
a[[= &C EBRACK
a[[=b &C EBRACK
a[[=b= &C EBRACK
a[[=b=] &C EBRACK
a[[=b,=]] &C ECOLLATE
a[[=one=]]b & a1b a1b
# complexities
a(((b)))c - abc abc
a(b|(c))d - abd abd
a(b*|c)d - abbd abbd
# just gotta have one DFA-buster, of course
a[ab]{20} - aaaaabaaaabaaaabaaaab aaaaabaaaabaaaabaaaab
# and an inline expansion in case somebody gets tricky
a[ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab] - aaaaabaaaabaaaabaaaab aaaaabaaaabaaaabaaaab
# and in case somebody just slips in an NFA...
a[ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab][ab](wee|week)(knights|night) - aaaaabaaaabaaaabaaaabweeknights aaaaabaaaabaaaabaaaabweeknights
# fish for anomalies as the number of states passes 32
12345678901234567890123456789 - a12345678901234567890123456789b 12345678901234567890123456789
123456789012345678901234567890 - a123456789012345678901234567890b 123456789012345678901234567890
1234567890123456789012345678901 - a1234567890123456789012345678901b 1234567890123456789012345678901
12345678901234567890123456789012 - a12345678901234567890123456789012b 12345678901234567890123456789012
123456789012345678901234567890123 - a123456789012345678901234567890123b 123456789012345678901234567890123
# and one really big one, beyond any plausible word width
1234567890123456789012345678901234567890123456789012345678901234567890 - a1234567890123456789012345678901234567890123456789012345678901234567890b 1234567890123456789012345678901234567890123456789012345678901234567890
# fish for problems as brackets go past 8
[ab][cd][ef][gh][ij][kl][mn] - xacegikmoq acegikm
[ab][cd][ef][gh][ij][kl][mn][op] - xacegikmoq acegikmo
[ab][cd][ef][gh][ij][kl][mn][op][qr] - xacegikmoqy acegikmoq
[ab][cd][ef][gh][ij][kl][mn][op][q] - xacegikmoqy acegikmoq
# subtleties of matching
abc & xabcy abc
a\(b\)?c\1d b acd
aBc i Abc Abc
a[Bc]*d i abBCcd abBCcd
0[[:upper:]]1 &i 0a1 0a1
0[[:lower:]]1 &i 0A1 0A1
a[^b]c &i abc
a[^b]c &i aBc
a[^b]c &i adc adc
[a]b[c] - abc abc
[a]b[a] - aba aba
[abc]b[abc] - abc abc
[abc]b[abd] - abd abd
a(b?c)+d - accd accd
(wee|week)(knights|night) - weeknights weeknights
(we|wee|week|frob)(knights|night|day) - weeknights weeknights
a[bc]d - xyzaaabcaababdacd abd
a[ab]c - aaabc abc
abc s abc abc
a* & b @b
# Let's have some fun -- try to match a C comment.
# first the obvious, which looks okay at first glance...
/\*.*\*/ - /*x*/ /*x*/
# but...
/\*.*\*/ - /*x*/y/*z*/ /*x*/y/*z*/
# okay, we must not match */ inside; try to do that...
/\*([^*]|\*[^/])*\*/ - /*x*/ /*x*/
/\*([^*]|\*[^/])*\*/ - /*x*/y/*z*/ /*x*/
# but...
/\*([^*]|\*[^/])*\*/ - /*x**/y/*z*/ /*x**/y/*z*/
# and a still fancier version, which does it right (I think)...
/\*([^*]|\*+[^*/])*\*+/ - /*x*/ /*x*/
/\*([^*]|\*+[^*/])*\*+/ - /*x*/y/*z*/ /*x*/
/\*([^*]|\*+[^*/])*\*+/ - /*x**/y/*z*/ /*x**/
/\*([^*]|\*+[^*/])*\*+/ - /*x****/y/*z*/ /*x****/
/\*([^*]|\*+[^*/])*\*+/ - /*x**x*/y/*z*/ /*x**x*/
/\*([^*]|\*+[^*/])*\*+/ - /*x***x/y/*z*/ /*x***x/y/*z*/
# subexpressions
a(b)(c)d - abcd abcd b,c
a(((b)))c - abc abc b,b,b
a(b|(c))d - abd abd b,-
a(b*|c|e)d - abbd abbd bb
a(b*|c|e)d - acd acd c
a(b*|c|e)d - ad ad @d
a(b?)c - abc abc b
a(b?)c - ac ac @c
a(b+)c - abc abc b
a(b+)c - abbbc abbbc bbb
a(b*)c - ac ac @c
(a|ab)(bc([de]+)f|cde) - abcdef abcdef a,bcdef,de
# the regression tester only asks for 9 subexpressions
a(b)(c)(d)(e)(f)(g)(h)(i)(j)k - abcdefghijk abcdefghijk b,c,d,e,f,g,h,i,j
a(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)l - abcdefghijkl abcdefghijkl b,c,d,e,f,g,h,i,j,k
a([bc]?)c - abc abc b
a([bc]?)c - ac ac @c
a([bc]+)c - abc abc b
a([bc]+)c - abcc abcc bc
a([bc]+)bc - abcbc abcbc bc
a(bb+|b)b - abb abb b
a(bbb+|bb+|b)b - abb abb b
a(bbb+|bb+|b)b - abbb abbb bb
a(bbb+|bb+|b)bb - abbb abbb b
(.*).* - abcdef abcdef abcdef
##(a*)* - bc @b @b
# do we get the right subexpression when it is used more than once?
a(b|c)*d - ad ad -
a(b|c)*d - abcd abcd c
a(b|c)+d - abd abd b
a(b|c)+d - abcd abcd c
a(b|c?)+d - ad ad @d
a(b|c?)+d - abcd abcd @d
a(b|c){0,0}d - ad ad -
a(b|c){0,1}d - ad ad -
a(b|c){0,1}d - abd abd b
a(b|c){0,2}d - ad ad -
a(b|c){0,2}d - abcd abcd c
a(b|c){0,}d - ad ad -
a(b|c){0,}d - abcd abcd c
a(b|c){1,1}d - abd abd b
a(b|c){1,1}d - acd acd c
a(b|c){1,2}d - abd abd b
a(b|c){1,2}d - abcd abcd c
a(b|c){1,}d - abd abd b
a(b|c){1,}d - abcd abcd c
a(b|c){2,2}d - acbd acbd b
a(b|c){2,2}d - abcd abcd c
a(b|c){2,4}d - abcd abcd c
a(b|c){2,4}d - abcbd abcbd b
a(b|c){2,4}d - abcbcd abcbcd c
a(b|c){2,}d - abcd abcd c
a(b|c){2,}d - abcbd abcbd b
##a(b+|((c)*))+d - abd abd @d,@d,-
##a(b+|((c)*))+d - abcd abcd @d,@d,-
# check out the STARTEND option
[abc] &# a(b)c b
[abc] &# a(d)c
[abc] &# a(bc)d b
[abc] &# a(dc)d c
. &# a()c
b.*c &# b(bc)c bc
b.* &# b(bc)c bc
.*c &# b(bc)c bc
# plain strings, with the NOSPEC flag
abc m abc abc
abc m xabcy abc
abc m xyz
a*b m aba*b a*b
a*b m ab
"" mC EMPTY
# cases involving NULs
aZb & a a
aZb &p a
#aZb &p# (aZb) aZb
aZ*b &p# (ab) ab
#a.b &# (aZb) aZb
#a.* &# (aZb)c aZb
# word boundaries (ick)
[[:<:]]a & a a
[[:<:]]a & ba
[[:<:]]a & -a a
a[[:>:]] & a a
a[[:>:]] & ab
a[[:>:]] & a- a
[[:<:]]a.c[[:>:]] & axcd-dayc-dazce-abc abc
[[:<:]]a.c[[:>:]] & axcd-dayc-dazce-abc-q abc
[[:<:]]a.c[[:>:]] & axc-dayc-dazce-abc axc
[[:<:]]b.c[[:>:]] & a_bxc-byc_d-bzc-q bzc
[[:<:]].x..[[:>:]] & y_xa_-_xb_y-_xc_-axdc _xc_
[[:<:]]a_b[[:>:]] & x_a_b
# past problems, and suspected problems
(A[1])|(A[2])|(A[3])|(A[4])|(A[5])|(A[6])|(A[7])|(A[8])|(A[9])|(A[A]) - A1 A1
abcdefghijklmnop i abcdefghijklmnop abcdefghijklmnop
abcdefghijklmnopqrstuv i abcdefghijklmnopqrstuv abcdefghijklmnopqrstuv
(ALAK)|(ALT[AB])|(CC[123]1)|(CM[123]1)|(GAMC)|(LC[23][EO ])|(SEM[1234])|(SL[ES][12])|(SLWW)|(SLF )|(SLDT)|(VWH[12])|(WH[34][EW])|(WP1[ESN]) - CC11 CC11
CC[13]1|a{21}[23][EO][123][Es][12]a{15}aa[34][EW]aaaaaaa[X]a - CC11 CC11
Char \([a-z0-9_]*\)\[.* b Char xyz[k Char xyz[k xyz
a?b - ab ab
-\{0,1\}[0-9]*$ b -5 -5
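
The C-comment cases in the table above can be tried outside the test harness. What follows is a minimal sketch, assuming a POSIX `<regex.h>` that provides `regcomp` and `regexec` with `REG_EXTENDED`. The helper name `first_match` is hypothetical and is not part of grep or of the regression driver.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: compile PATTERN as a POSIX extended regex, search
   TEXT, and copy the first match into OUT (capacity OUTSZ, always
   NUL-terminated).  Returns 1 on a match, 0 on no match, and -1 if the
   pattern does not compile or the match does not fit in OUT.  */
static int
first_match (const char *pattern, const char *text, char *out, size_t outsz)
{
  regex_t re;
  regmatch_t m[1];
  int result = 0;

  assert (outsz > 0);
  out[0] = '\0';

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;

  if (regexec (&re, text, 1, m, 0) == 0)
    {
      /* m[0] holds the byte offsets of the overall match.  */
      size_t len = (size_t) (m[0].rm_eo - m[0].rm_so);
      if (len < outsz)
        {
          memcpy (out, text + m[0].rm_so, len);
          out[len] = '\0';
          result = 1;
        }
      else
        result = -1;
    }

  regfree (&re);
  return result;
}
```

Written as a C string literal, the "still fancier" pattern is `"/\\*([^*]|\\*+[^*/])*\\*+/"`. Applied to `/*x**/y/*z*/` it should report `/*x**/`, as the table expects, while the naive `"/\\*.*\\*/"` swallows both comments.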


@ -1,4 +1,4 @@
#! /bin/sh
#!/bin/sh
#
# Tell them not to be alarmed.

gnu/usr.bin/grep/xalloc.h Normal file

@ -0,0 +1,87 @@
/* xalloc.h -- malloc with out-of-memory checking
Copyright (C) 1990-1998, 1999, 2000 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
#ifndef XALLOC_H_
# define XALLOC_H_
# ifndef PARAMS
# if defined PROTOTYPES || (defined __STDC__ && __STDC__)
# define PARAMS(Args) Args
# else
# define PARAMS(Args) ()
# endif
# endif
# ifndef __attribute__
# if __GNUC__ < 2 || (__GNUC__ == 2 && __GNUC_MINOR__ < 8) || __STRICT_ANSI__
# define __attribute__(x)
# endif
# endif
# ifndef ATTRIBUTE_NORETURN
# define ATTRIBUTE_NORETURN __attribute__ ((__noreturn__))
# endif
/* Exit value when the requested amount of memory is not available.
It is initialized to EXIT_FAILURE, but the caller may set it to
some other value. */
extern int xalloc_exit_failure;
/* If this pointer is non-zero, run the specified function upon each
allocation failure. It is initialized to zero. */
extern void (*xalloc_fail_func) PARAMS ((void));
/* If XALLOC_FAIL_FUNC is undefined or a function that returns, this
message is output. It is translated via gettext.
Its value is "memory exhausted". */
extern char const xalloc_msg_memory_exhausted[];
/* This function is always triggered when memory is exhausted. It is
in charge of honoring the three previous items. This is the
function to call when one wants the program to die because of a
memory allocation failure. */
extern void xalloc_die PARAMS ((void)) ATTRIBUTE_NORETURN;
void *xmalloc PARAMS ((size_t n));
void *xcalloc PARAMS ((size_t n, size_t s));
void *xrealloc PARAMS ((void *p, size_t n));
char *xstrdup PARAMS ((const char *str));
# define XMALLOC(Type, N_items) ((Type *) xmalloc (sizeof (Type) * (N_items)))
# define XCALLOC(Type, N_items) ((Type *) xcalloc (sizeof (Type), (N_items)))
# define XREALLOC(Ptr, Type, N_items) \
((Type *) xrealloc ((void *) (Ptr), sizeof (Type) * (N_items)))
/* Declare and alloc memory for VAR of type TYPE. */
# define NEW(Type, Var) Type *(Var) = XMALLOC (Type, 1)
/* Free VAR only if non NULL. */
# define XFREE(Var) \
do { \
if (Var) \
free (Var); \
} while (0)
/* Return a pointer to a malloc'ed copy of the array SRC of NUM elements. */
# define CCLONE(Src, Num) \
(memcpy (xmalloc (sizeof (*Src) * (Num)), (Src), sizeof (*Src) * (Num)))
/* Return a malloc'ed copy of SRC. */
# define CLONE(Src) CCLONE (Src, 1)
#endif /* !XALLOC_H_ */
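
The typed macros in this header are easier to follow with a caller in view. The sketch below is an illustration only: `sketch_xmalloc` is a stand-in so the block compiles on its own, the `SKETCH_` macros copy the shape of `XMALLOC` and `CCLONE`, and `sketch_iota` and `sketch_squares` are hypothetical callers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for xmalloc so the sketch is self-contained; the real
   function lives in xmalloc.c and reports failure via xalloc_die.  */
static void *
sketch_xmalloc (size_t n)
{
  void *p = malloc (n);
  if (p == 0)
    {
      fputs ("memory exhausted\n", stderr);
      exit (EXIT_FAILURE);
    }
  return p;
}

/* Same shape as XMALLOC and CCLONE above, renamed and pointed at the
   stand-in allocator.  */
#define SKETCH_XMALLOC(Type, N_items) \
  ((Type *) sketch_xmalloc (sizeof (Type) * (N_items)))
#define SKETCH_CCLONE(Src, Num) \
  (memcpy (sketch_xmalloc (sizeof (*Src) * (Num)), (Src), sizeof (*Src) * (Num)))

/* Hypothetical caller: allocate COUNT ints holding 0, 1, ..., COUNT-1.  */
static int *
sketch_iota (size_t count)
{
  int *v;
  size_t i;

  assert (count > 0);
  v = SKETCH_XMALLOC (int, count);
  for (i = 0; i < count; i++)
    v[i] = (int) i;
  return v;
}

/* Hypothetical caller: clone SRC, then square each element of the copy.
   The original array is left untouched.  */
static int *
sketch_squares (const int *src, size_t count)
{
  int *copy;
  size_t i;

  assert (count > 0);
  copy = SKETCH_CCLONE (src, count);
  for (i = 0; i < count; i++)
    copy[i] *= copy[i];
  return copy;
}
```

Both callers return memory that the caller must release with `free`. Note that neither macro checks `sizeof (Type) * (N_items)` for overflow, so very large counts need a separate check.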

gnu/usr.bin/grep/xmalloc.c Normal file

@ -0,0 +1,116 @@
/* xmalloc.c -- malloc with out of memory checking
Copyright (C) 1990-1999, 2000 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#include <sys/types.h>
#if STDC_HEADERS
# include <stdlib.h>
#else
void *calloc ();
void *malloc ();
void *realloc ();
void free ();
#endif
#if ENABLE_NLS
# include <libintl.h>
# define _(Text) gettext (Text)
#else
# define textdomain(Domain)
# define _(Text) Text
#endif
#define N_(Text) Text
#include "error.h"
#include "xalloc.h"
#ifndef EXIT_FAILURE
# define EXIT_FAILURE 1
#endif
#ifndef HAVE_DONE_WORKING_MALLOC_CHECK
"you must run the autoconf test for a properly working malloc -- see malloc.m4"
#endif
#ifndef HAVE_DONE_WORKING_REALLOC_CHECK
"you must run the autoconf test for a properly working realloc --see realloc.m4"
#endif
/* Exit value when the requested amount of memory is not available.
The caller may set it to some other value. */
int xalloc_exit_failure = EXIT_FAILURE;
/* If non NULL, call this function when memory is exhausted. */
void (*xalloc_fail_func) PARAMS ((void)) = 0;
/* If XALLOC_FAIL_FUNC is NULL, or does return, display this message
before exiting when memory is exhausted. Goes through gettext. */
char const xalloc_msg_memory_exhausted[] = N_("memory exhausted");
void
xalloc_die (void)
{
if (xalloc_fail_func)
(*xalloc_fail_func) ();
error (xalloc_exit_failure, 0, "%s", _(xalloc_msg_memory_exhausted));
/* The `noreturn' attribute cannot be given to error, since it may
return if its first argument is 0. To help compilers understand
that xalloc_die does terminate, call exit. */
exit (EXIT_FAILURE);
}
/* Allocate N bytes of memory dynamically, with error checking. */
void *
xmalloc (size_t n)
{
void *p;
p = malloc (n);
if (p == 0)
xalloc_die ();
return p;
}
/* Change the size of an allocated block of memory P to N bytes,
with error checking. */
void *
xrealloc (void *p, size_t n)
{
p = realloc (p, n);
if (p == 0)
xalloc_die ();
return p;
}
/* Allocate memory for N elements of S bytes, with error checking. */
void *
xcalloc (size_t n, size_t s)
{
void *p;
p = calloc (n, s);
if (p == 0)
xalloc_die ();
return p;
}
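
The failure hook is the least obvious part of this file: `xalloc_fail_func` runs before the message is printed, and it is allowed not to return. The sketch below illustrates that control flow under stated assumptions. Every `sketch_` name is hypothetical, the raw allocator is injectable only so that the failure path can be exercised, and `longjmp` is one possible way for a hook to avoid returning, not something this file prescribes.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins mirroring xalloc_exit_failure and xalloc_fail_func.  */
static int sketch_exit_failure = EXIT_FAILURE;
static void (*sketch_fail_func) (void) = 0;

/* Injectable raw allocator (an assumption of this sketch only).  */
static void *(*sketch_raw_alloc) (size_t) = malloc;

/* Same order of operations as xalloc_die: hook first, then message
   and exit if the hook is unset or returns.  */
static void
sketch_die (void)
{
  if (sketch_fail_func)
    (*sketch_fail_func) ();
  fputs ("memory exhausted\n", stderr);
  exit (sketch_exit_failure);
}

static void *
sketch_xmalloc (size_t n)
{
  void *p = sketch_raw_alloc (n);
  if (p == 0)
    sketch_die ();
  return p;
}

/* A hook that does not return: it jumps back to a recovery point.  */
static jmp_buf sketch_recover;

static void
sketch_jump_out (void)
{
  longjmp (sketch_recover, 1);
}

/* An allocator that always fails, for exercising the hook.  */
static void *
sketch_always_fail (size_t n)
{
  (void) n;
  return 0;
}

/* Return 1 if the failure hook fired, 0 if the allocation succeeded.  */
static int
sketch_try_alloc (size_t n)
{
  assert (n > 0);
  sketch_fail_func = sketch_jump_out;
  if (setjmp (sketch_recover) != 0)
    return 1;                   /* reached via longjmp from the hook */
  free (sketch_xmalloc (n));
  return 0;
}
```

Setting `sketch_raw_alloc = sketch_always_fail` before calling `sketch_try_alloc` drives the hook path. With the default `malloc` and a small request, the function should return 0.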

gnu/usr.bin/grep/xstrtol.c Normal file

@ -0,0 +1,282 @@
/* A more useful interface to strtol.
Copyright (C) 1995, 1996, 1998-2000 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
/* Written by Jim Meyering. */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#ifndef __strtol
# define __strtol strtol
# define __strtol_t long int
# define __xstrtol xstrtol
#endif
/* Some pre-ANSI implementations (e.g. SunOS 4)
need stderr defined if assertion checking is enabled. */
#include <stdio.h>
#if STDC_HEADERS
# include <stdlib.h>
#endif
#if HAVE_STRING_H
# include <string.h>
#else
# include <strings.h>
# ifndef strchr
# define strchr index
# endif
#endif
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#ifndef errno
extern int errno;
#endif
#if HAVE_LIMITS_H
# include <limits.h>
#endif
#ifndef CHAR_BIT
# define CHAR_BIT 8
#endif
/* The extra casts work around common compiler bugs. */
#define TYPE_SIGNED(t) (! ((t) 0 < (t) -1))
/* The outer cast is needed to work around a bug in Cray C 5.0.3.0.
It is necessary at least when t == time_t. */
#define TYPE_MINIMUM(t) ((t) (TYPE_SIGNED (t) \
? ~ (t) 0 << (sizeof (t) * CHAR_BIT - 1) : (t) 0))
#define TYPE_MAXIMUM(t) (~ (t) 0 - TYPE_MINIMUM (t))
#if defined (STDC_HEADERS) || (!defined (isascii) && !defined (HAVE_ISASCII))
# define IN_CTYPE_DOMAIN(c) 1
#else
# define IN_CTYPE_DOMAIN(c) isascii(c)
#endif
#define ISSPACE(c) (IN_CTYPE_DOMAIN (c) && isspace (c))
#include "xstrtol.h"
#ifndef strtol
long int strtol ();
#endif
#ifndef strtoul
unsigned long int strtoul ();
#endif
#ifndef strtoumax
uintmax_t strtoumax ();
#endif
static int
bkm_scale (__strtol_t *x, int scale_factor)
{
__strtol_t product = *x * scale_factor;
if (*x != product / scale_factor)
return 1;
*x = product;
return 0;
}
static int
bkm_scale_by_power (__strtol_t *x, int base, int power)
{
while (power--)
if (bkm_scale (x, base))
return 1;
return 0;
}
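/* Parse the number at the start of S in base STRTOL_BASE and store it
in *VAL. If PTR is non-null, set *PTR to the first character not
consumed. VALID_SUFFIXES lists the multiplier suffix characters that
may follow the number; if it is NULL, no suffix processing is done.
Return LONGINT_OK on success, or another strtol_error value that
describes the failure. */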
strtol_error
__xstrtol (const char *s, char **ptr, int strtol_base,
__strtol_t *val, const char *valid_suffixes)
{
char *t_ptr;
char **p;
__strtol_t tmp;
assert (0 <= strtol_base && strtol_base <= 36);
p = (ptr ? ptr : &t_ptr);
if (! TYPE_SIGNED (__strtol_t))
{
const char *q = s;
while (ISSPACE ((unsigned char) *q))
++q;
if (*q == '-')
return LONGINT_INVALID;
}
errno = 0;
tmp = __strtol (s, p, strtol_base);
if (errno != 0)
return LONGINT_OVERFLOW;
if (*p == s)
return LONGINT_INVALID;
/* Let valid_suffixes == NULL mean `allow any suffix'. */
/* FIXME: update all callers except the ones that allow suffixes
after the number, changing last parameter NULL to `""'. */
if (!valid_suffixes)
{
*val = tmp;
return LONGINT_OK;
}
if (**p != '\0')
{
int base = 1024;
int suffixes = 1;
int overflow;
if (!strchr (valid_suffixes, **p))
{
*val = tmp;
return LONGINT_INVALID_SUFFIX_CHAR;
}
if (strchr (valid_suffixes, '0'))
{
/* The ``valid suffix'' '0' is a special flag meaning that
an optional second suffix is allowed, which can change
the base, e.g. "100MD" for 100 megabytes decimal. */
switch (p[0][1])
{
case 'B':
suffixes++;
break;
case 'D':
base = 1000;
suffixes++;
break;
}
}
switch (**p)
{
case 'b':
overflow = bkm_scale (&tmp, 512);
break;
case 'B':
overflow = bkm_scale (&tmp, 1024);
break;
case 'c':
overflow = 0;
break;
case 'E': /* Exa */
overflow = bkm_scale_by_power (&tmp, base, 6);
break;
case 'G': /* Giga */
overflow = bkm_scale_by_power (&tmp, base, 3);
break;
case 'k': /* kilo */
overflow = bkm_scale_by_power (&tmp, base, 1);
break;
case 'M': /* Mega */
case 'm': /* 'm' is undocumented; for backward compatibility only */
overflow = bkm_scale_by_power (&tmp, base, 2);
break;
case 'P': /* Peta */
overflow = bkm_scale_by_power (&tmp, base, 5);
break;
case 'T': /* Tera */
overflow = bkm_scale_by_power (&tmp, base, 4);
break;
case 'w':
overflow = bkm_scale (&tmp, 2);
break;
case 'Y': /* Yotta */
overflow = bkm_scale_by_power (&tmp, base, 8);
break;
case 'Z': /* Zetta */
overflow = bkm_scale_by_power (&tmp, base, 7);
break;
default:
*val = tmp;
return LONGINT_INVALID_SUFFIX_CHAR;
break;
}
if (overflow)
return LONGINT_OVERFLOW;
(*p) += suffixes;
}
*val = tmp;
return LONGINT_OK;
}
#ifdef TESTING_XSTRTO
# include <stdio.h>
# include "error.h"
char *program_name;
int
main (int argc, char** argv)
{
strtol_error s_err;
int i;
program_name = argv[0];
for (i=1; i<argc; i++)
{
char *p;
__strtol_t val;
s_err = __xstrtol (argv[i], &p, 0, &val, "bckmw");
if (s_err == LONGINT_OK)
{
printf ("%s->%lu (%s)\n", argv[i], (unsigned long) val, p);
}
else
{
STRTOL_FATAL_ERROR (argv[i], "arg", s_err);
}
}
exit (0);
}
#endif /* TESTING_XSTRTO */
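
The suffix handling in `__xstrtol` can be summarised with a much smaller function. The sketch below is a simplification under stated assumptions, not a replacement: it handles only `long`, base 10, and the suffixes `c`, `w`, `b`, `k` and `M` with the same multipliers as above; it rejects anything after the suffix; and it checks the range before multiplying instead of dividing the product afterwards as `bkm_scale` does. All `sketch_` and `SKETCH_` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

enum sketch_status
{
  SKETCH_OK, SKETCH_INVALID, SKETCH_BAD_SUFFIX, SKETCH_OVERFLOW
};

/* Multiply *X by FACTOR (> 0).  Return 1 and leave *X alone if the
   product would not fit in a long, otherwise store it and return 0.  */
static int
sketch_scale (long *x, long factor)
{
  assert (factor > 0);
  if (*x > LONG_MAX / factor || *x < LONG_MIN / factor)
    return 1;
  *x *= factor;
  return 0;
}

/* Parse S as a decimal number with an optional one-letter multiplier.  */
static enum sketch_status
sketch_parse_size (const char *s, long *val)
{
  char *end;
  long tmp;
  long factor;

  errno = 0;
  tmp = strtol (s, &end, 10);
  if (end == s)
    return SKETCH_INVALID;      /* no digits at all */
  if (errno != 0)
    return SKETCH_OVERFLOW;     /* strtol itself ran out of range */

  switch (*end)
    {
    case '\0': factor = 1; break;
    case 'c':  factor = 1; break;               /* no scaling */
    case 'w':  factor = 2; break;
    case 'b':  factor = 512; break;
    case 'k':  factor = 1024; break;
    case 'M':  factor = 1024L * 1024L; break;
    default:   return SKETCH_BAD_SUFFIX;
    }

  /* Simplification: nothing may follow the suffix.  */
  if (*end != '\0' && end[1] != '\0')
    return SKETCH_BAD_SUFFIX;

  if (sketch_scale (&tmp, factor))
    return SKETCH_OVERFLOW;

  *val = tmp;
  return SKETCH_OK;
}
```

For instance, `sketch_parse_size ("10k", &v)` should leave 10240 in `v`, and `"3b"` should give 1536. The pre-multiplication range check is used here because overflow in signed multiplication is undefined behaviour in C.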


@ -0,0 +1,64 @@
#ifndef XSTRTOL_H_
# define XSTRTOL_H_ 1
# if HAVE_INTTYPES_H
# include <inttypes.h> /* for uintmax_t */
# endif
# ifndef PARAMS
# if defined PROTOTYPES || (defined __STDC__ && __STDC__)
# define PARAMS(Args) Args
# else
# define PARAMS(Args) ()
# endif
# endif
# ifndef _STRTOL_ERROR
enum strtol_error
{
LONGINT_OK, LONGINT_INVALID, LONGINT_INVALID_SUFFIX_CHAR, LONGINT_OVERFLOW
};
typedef enum strtol_error strtol_error;
# endif
# define _DECLARE_XSTRTOL(name, type) \
strtol_error \
name PARAMS ((const char *s, char **ptr, int base, \
type *val, const char *valid_suffixes));
_DECLARE_XSTRTOL (xstrtol, long int)
_DECLARE_XSTRTOL (xstrtoul, unsigned long int)
_DECLARE_XSTRTOL (xstrtoumax, uintmax_t)
# define _STRTOL_ERROR(Exit_code, Str, Argument_type_string, Err) \
do \
{ \
switch ((Err)) \
{ \
case LONGINT_OK: \
abort (); \
\
case LONGINT_INVALID: \
error ((Exit_code), 0, "invalid %s `%s'", \
(Argument_type_string), (Str)); \
break; \
\
case LONGINT_INVALID_SUFFIX_CHAR: \
error ((Exit_code), 0, "invalid character following %s `%s'", \
(Argument_type_string), (Str)); \
break; \
\
case LONGINT_OVERFLOW: \
error ((Exit_code), 0, "%s `%s' too large", \
(Argument_type_string), (Str)); \
break; \
} \
} \
while (0)
# define STRTOL_FATAL_ERROR(Str, Argument_type_string, Err) \
_STRTOL_ERROR (2, Str, Argument_type_string, Err)
# define STRTOL_FAIL_WARN(Str, Argument_type_string, Err) \
_STRTOL_ERROR (0, Str, Argument_type_string, Err)
#endif /* not XSTRTOL_H_ */


@ -0,0 +1,31 @@
/* xstrtoumax.c -- A more useful interface to strtoumax.
Copyright 1999 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software Foundation,
Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA. */
/* Written by Paul Eggert. */
#if HAVE_CONFIG_H
# include <config.h>
#endif
#if HAVE_INTTYPES_H
# include <inttypes.h>
#endif
#define __strtol strtoumax
#define __strtol_t uintmax_t
#define __xstrtol xstrtoumax
#include "xstrtol.c"
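
This file gets its `uintmax_t` variant by redefining three macros and including `xstrtol.c` again, which works as a crude template. A single-file analogue of the same idea is a macro that stamps out one function per type. The sketch below uses that swapped-in technique, not the re-inclusion used here, and every name in it is hypothetical. It is limited to unsigned types of at least `int` width, where the divide-back overflow test from `bkm_scale` is well defined.

```c
#include <assert.h>
#include <limits.h>

/* Generate NAME, which multiplies *X by SCALE_FACTOR for the given
   unsigned TYPE.  It returns 1 and leaves *X alone when the product
   wraps around, otherwise it stores the product and returns 0.  */
#define SKETCH_DEFINE_SCALE(name, type)                 \
  static int                                            \
  name (type *x, type scale_factor)                     \
  {                                                     \
    type product;                                       \
    assert (x != 0);                                    \
    if (scale_factor == 0)                              \
      {                                                 \
        *x = 0;         /* avoid dividing by zero */    \
        return 0;                                       \
      }                                                 \
    product = *x * scale_factor;                        \
    if (*x != product / scale_factor)                   \
      return 1;                                         \
    *x = product;                                       \
    return 0;                                           \
  }

/* One instantiation per type, much as _DECLARE_XSTRTOL emits one
   prototype per type in xstrtol.h.  */
SKETCH_DEFINE_SCALE (sketch_scale_uint, unsigned int)
SKETCH_DEFINE_SCALE (sketch_scale_ulong, unsigned long)
```

Re-including a source file keeps the shared body readable and debuggable as ordinary code, whereas the macro form keeps everything in one translation unit. Which trade-off is better depends on how large the shared body is.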