This commit was manufactured by cvs2svn to create tag 'grep_2_0'.

svn path=/cvs2svn/tags/grep_2_0/; revision=95
1993-07-06 18:45:27 +00:00 · 1993-07-06 18:45:27 +00:00 · c0712724bc · 2020-12-20 02:59:44 +00:00
commit c0712724bc
parent 717f769197
23 changed files with 10095 additions and 3928 deletions
--- a/gnu/usr.bin/grep/AUTHORS
+++ b/gnu/usr.bin/grep/AUTHORS
@@ -0,0 +1,29 @@
+Mike Haertel wrote the main program and the dfa and kwset matchers.
+
+Arthur David Olson contributed the heuristics for finding fixed substrings
+at the end of dfa.c.
+
+Richard Stallman and Karl Berry wrote the regex backtracking matcher.
+
+Henry Spencer wrote the original test suite from which grep's was derived.
+
+Scott Anderson invented the Khadafy test.
+
+David MacKenzie wrote the automatic configuration software used to
+produce the configure script.
+
+Authors of the replacements for standard library routines are identified
+in the corresponding source files.
+
+The idea of using Boyer-Moore type algorithms to quickly filter out
+non-matching text before calling the regexp matcher was originally due
+to James Woods.  He also contributed some code to early versions of
+GNU grep.
+
+Finally, I would like to thank Andrew Hume for many fascinating discussions
+of string searching issues over the years.  Hume & Sunday's excellent
+paper on fast string searching (AT&T Bell Laboratories CSTR #156)
+describes some of the history of the subject, as well as providing
+exhaustive performance analysis of various implementation alternatives.
+The inner loop of GNU grep is similar to Hume & Sunday's recommended
+"Tuned Boyer Moore" inner loop.
--- a/gnu/usr.bin/grep/Makefile
+++ b/gnu/usr.bin/grep/Makefile
@@ -1,7 +1,13 @@
-
 PROG=	grep
-SRCS=	dfa.c regex.o grep.o
-CFLAGS+=-DSTDC_HEADERS=1 -DHAVE_UNISTD_H=1
-MLINKS= grep.1 egrep.1
+SRCS=	dfa.c grep.c getopt.c kwset.c obstack.c regex.c search.c
+
+CFLAGS+=-DGREP -DHAVE_STRING_H=1 -DHAVE_SYS_PARAM_H=1 -DHAVE_UNISTD_H=1 \
+	-DHAVE_GETPAGESIZE=1 -DHAVE_MEMCHR=1 -DHAVE_STRERROR=1 \
+	-DHAVE_VALLOC=1
+
+
+#check: ${.CURDIR}/grep
+check: all
+	awk sh ${.CURDIR}/tests/check.sh ${.CURDIR}/tests

 .include <bsd.prog.mk>
--- a/gnu/usr.bin/grep/NEWS
+++ b/gnu/usr.bin/grep/NEWS
@@ -0,0 +1,35 @@
+Version 2.0:
+
+The most important user visible change is that egrep and fgrep have
+disappeared as separate programs into the single grep program mandated
+by POSIX 1003.2.  New options -G, -E, and -F have been added,
+selecting grep, egrep, and fgrep behavior respectively.  For
+compatibility with historical practice, hard links named egrep and
+fgrep are also provided.  See the manual page for details.
+
+In addition, the regular expression facilities described in Posix
+draft 11.2 are now supported, except for internationalization features
+related to locale-dependent collating sequence information.
+
+There is a new option, -L, which is like -l except it lists
+files which don't contain matches.  The reason this option was
+added is because '-l -v' doesn't do what you expect.
+
+Performance has been improved; the amount of improvement is platform
+dependent, but (for example) grep 2.0 typically runs at least 30% faster
+than grep 1.6 on a DECstation using the MIPS compiler.  Where possible,
+grep now uses mmap() for file input; on a Sun 4 running SunOS 4.1 this
+may cut system time by as much as half, for a total reduction in running
+time by nearly 50%.  On machines that don't use mmap(), the buffering
+code has been rewritten to choose more favorable alignments and buffer
+sizes for read().
+
+Portability has been substantially cleaned up, and an automatic
+configure script is now provided.
+
+The internals have changed in ways too numerous to mention.
+People brave enough to reuse the DFA matcher in other programs
+will now have their bravery amply "rewarded", for the interface
+to that file has been completely changed.  Some changes were
+necessary to track the evolution of the regex package, and since
+I was changing it anyway I decided to do a general cleanup.
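The NEWS entry above says that egrep and fgrep survive only as hard links to a single grep binary, with -G, -E, and -F selecting the behavior. One way a single program can honor such links is to look at the name it was invoked under. The following is a minimal sketch of that idea only; `default_matcher()` is a hypothetical helper, not a function from grep.c, and the real program may choose its default differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from grep.c): map the invocation name,
   normally argv[0], to the option letter that selects the same
   behavior: 'E' for egrep, 'F' for fgrep, 'G' for anything else. */
char
default_matcher(const char *argv0)
{
  const char *base;

  assert(argv0 != NULL);

  /* Strip any leading directory components. */
  base = strrchr(argv0, '/');
  base = base ? base + 1 : argv0;

  if (strcmp(base, "egrep") == 0)
    return 'E';
  if (strcmp(base, "fgrep") == 0)
    return 'F';
  return 'G';
}
```

A caller would typically compute this default first and then let an explicit -G, -E, or -F on the command line override it.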
--- a/gnu/usr.bin/grep/PROJECTS
+++ b/gnu/usr.bin/grep/PROJECTS
@@ -0,0 +1,15 @@
+Write Texinfo documentation for grep.  The manual page would be a good
+place to start, but Info documents are also supposed to contain a
+tutorial and examples.
+
+Fix the DFA matcher to never use exponential space.  (Fortunately, these
+cases are rare.)
+
+Improve the performance of the regex backtracking matcher.  This matcher
+is agonizingly slow, and is responsible for grep sometimes being slower
+than Unix grep when backreferences are used.
+
+Provide support for the Posix [= =] and [. .] constructs.  This is
+difficult because it requires locale-dependent details of the character
+set and collating sequence, but Posix does not standardize any method
+for accessing this information!
--- a/gnu/usr.bin/grep/README
+++ b/gnu/usr.bin/grep/README
@@ -1,70 +1,28 @@
-This README documents GNU e?grep version 1.6.  All bugs reported for
-previous versions have been fixed.
+This is GNU grep 2.0, the "fastest grep in the west" (we hope).  All
+bugs reported in previous releases have been fixed.  Many exciting new
+bugs have probably been introduced in this major revision.

-See the file INSTALL for compilation and installation instructions.
-
-Send bug reports to bug-gnu-utils@prep.ai.mit.edu.
-
-GNU e?grep is provided "as is" with no warranty.  The exact terms
+GNU grep is provided "as is" with no warranty.  The exact terms
 under which you may use and (re)distribute this program are detailed
 in the GNU General Public License, in the file COPYING.

-GNU e?grep is based on a fast lazy-state deterministic matcher (about
+GNU grep is based on a fast lazy-state deterministic matcher (about
 twice as fast as stock Unix egrep) hybridized with a Boyer-Moore-Gosper
 search for a fixed string that eliminates impossible text from being
 considered by the full regexp matcher without necessarily having to
 look at every character.  The result is typically many times faster
 than Unix grep or egrep.  (Regular expressions containing backreferencing
-may run more slowly, however.)
+will run more slowly, however.)

-GNU e?grep is brought to you by the efforts of several people:
+See the file AUTHORS for a list of authors and other contributors.

-	Mike Haertel wrote the deterministic regexp code and the bulk
-	of the program.
+See the file INSTALL for compilation and installation instructions.

-	James A. Woods is responsible for the hybridized search strategy
-	of using Boyer-Moore-Gosper fixed-string search as a filter
-	before calling the general regexp matcher.
+See the file MANIFEST for a list of files in this distribution.

-	Arthur David Olson contributed code that finds fixed strings for
-	the aforementioned BMG search for a large class of regexps.
+See the file NEWS for a description of major changes in this release.

-	Richard Stallman wrote the backtracking regexp matcher that is
-	used for \<digit> backreferences, as well as the getopt that
-	is provided for 4.2BSD sites.  The backtracking matcher was
-	originally written for GNU Emacs.
+See the file PROJECTS if you want to be mentioned in AUTHORS.

-	D. A. Gwyn wrote the C alloca emulation that is provided so
-	System V machines can run this program.  (Alloca is used only
-	by RMS' backtracking matcher, and then only rarely, so there
-	is no loss if your machine doesn't have a "real" alloca.)
-
-	Scott Anderson and Henry Spencer designed the regression tests
-	used in the "regress" script.
-
-	Paul Placeway wrote the manual page, based on this README.
-
-If you are interested in improving this program, you may wish to try
-any of the following:
-
-1.  Replace the fast search loop with a faster search loop.
-    There are several things that could be improved, the most notable
-    of which would be to calculate a minimal delta2 to use.
-
-2.  Make backreferencing \<digit> faster.  Right now, backreferencing is
-    handled by calling the Emacs backtracking matcher to verify the partial
-    match.  This is slow; if the DFA routines could handle backreferencing
-    themselves a speedup on the order of three to four times might occur
-    in those cases where the backtracking matcher is called to verify nearly
-    every line.  Also, some portability problems due to the inclusion of the
-    emacs matcher would be solved because it could then be eliminated.
-    Note that expressions with backreferencing are not true regular
-    expressions, and thus are not equivalent to any DFA.  So this is hard.
-
-3.  Handle POSIX style regexps.  I'm not sure if this could be called an
-    improvement; some of the things on regexps in the POSIX draft I have
-    seen are pretty sickening.  But it would be useful in the interests of
-    conforming to the standard.
-
-4.  Replace the main driver program grep.c with the much cleaner main driver
-    program used in GNU fgrep.
+Send bug reports to bug-gnu-utils@prep.ai.mit.edu.  Be sure to
+include the word "grep" in your Subject: header field.
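The README above credits much of grep's speed to a Boyer-Moore-style fixed-string search that skips over text without looking at every character. As an illustration of the skipping idea only, here is a sketch of the simpler Boyer-Moore-Horspool variant. It is not the kwset.c code and not the "Tuned Boyer Moore" loop mentioned in AUTHORS; the name `bmh_search()` and its signature are assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Illustrative Boyer-Moore-Horspool search (hypothetical helper).
   Returns a pointer to the first occurrence of pat (length m) in
   text (length n), or NULL if there is none. */
const char *
bmh_search(const char *text, size_t n, const char *pat, size_t m)
{
  size_t skip[UCHAR_MAX + 1];
  size_t i, pos;

  assert(text != NULL && pat != NULL);
  if (m == 0)
    return text;
  if (m > n)
    return NULL;

  /* A character that does not occur in the pattern lets the window
     jump forward by the full pattern length. */
  for (i = 0; i <= UCHAR_MAX; i++)
    skip[i] = m;
  /* Characters that do occur (other than in the last position) allow
     a shorter jump that lines up their rightmost occurrence. */
  for (i = 0; i + 1 < m; i++)
    skip[(unsigned char) pat[i]] = m - 1 - i;

  pos = 0;
  while (pos <= n - m)
    {
      if (memcmp(text + pos, pat, m) == 0)
        return text + pos;
      /* Shift by the skip for the text character under the last
         pattern position; this is always at least 1. */
      pos += skip[(unsigned char) text[pos + m - 1]];
    }
  return NULL;
}
```

In a hybrid design like the one the README describes, a hit from such a filter would only mark a candidate region; the full regexp matcher would still have to confirm the match.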
--- a/gnu/usr.bin/grep/dfa.c
+++ b/gnu/usr.bin/grep/dfa.c
--- a/gnu/usr.bin/grep/dfa.h
+++ b/gnu/usr.bin/grep/dfa.h
@@ -16,210 +16,115 @@
   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  */

 /* Written June, 1988 by Mike Haertel */
-
-#ifdef STDC_HEADERS

-#include <stddef.h>
-#include <stdlib.h>
-
-#else /* !STDC_HEADERS */
-
-#define const
-#include <sys/types.h>		/* For size_t.  */
-extern char *calloc(), *malloc(), *realloc();
-extern void free();
-
-#ifndef NULL
-#define NULL 0
-#endif
-
-#endif /* ! STDC_HEADERS */
-
-#include <ctype.h>
-#ifndef isascii
-#define ISALNUM(c) isalnum(c)
-#define ISALPHA(c) isalpha(c)
-#define ISUPPER(c) isupper(c)
-#define ISLOWER(c) islower(c)
-#else
-#define ISALNUM(c) (isascii(c) && isalnum(c))
-#define ISALPHA(c) (isascii(c) && isalpha(c))
-#define ISUPPER(c) (isascii(c) && isupper(c))
-#define ISLOWER(c) (isascii(c) && islower(c))
-#endif
-
-/* 1 means plain parentheses serve as grouping, and backslash
-     parentheses are needed for literal searching.
-   0 means backslash-parentheses are grouping, and plain parentheses
-     are for literal searching.  */
-#define RE_NO_BK_PARENS 1
-
-/* 1 means plain | serves as the "or"-operator, and \| is a literal.
-   0 means \| serves as the "or"-operator, and | is a literal.  */
-#define RE_NO_BK_VBAR 2
-
-/* 0 means plain + or ? serves as an operator, and \+, \? are literals.
-   1 means \+, \? are operators and plain +, ? are literals.  */
-#define RE_BK_PLUS_QM 4
-
-/* 1 means | binds tighter than ^ or $.
-   0 means the contrary.  */
-#define RE_TIGHT_VBAR 8
-
-/* 1 means treat \n as an _OR operator
-   0 means treat it as a normal character */
-#define RE_NEWLINE_OR 16
-
-/* 0 means that a special characters (such as *, ^, and $) always have
-     their special meaning regardless of the surrounding context.
-   1 means that special characters may act as normal characters in some
-     contexts.  Specifically, this applies to:
-	^ - only special at the beginning, or after ( or |
-	$ - only special at the end, or before ) or |
-	*, +, ? - only special when not after the beginning, (, or | */
-#define RE_CONTEXT_INDEP_OPS 32
-
-/* Now define combinations of bits for the standard possibilities.  */
-#define RE_SYNTAX_AWK (RE_NO_BK_PARENS | RE_NO_BK_VBAR | RE_CONTEXT_INDEP_OPS)
-#define RE_SYNTAX_EGREP (RE_SYNTAX_AWK | RE_NEWLINE_OR)
-#define RE_SYNTAX_GREP (RE_BK_PLUS_QM | RE_NEWLINE_OR)
-#define RE_SYNTAX_EMACS 0
+/* FIXME:
+   2.  We should not export so much of the DFA internals.
+   In addition to clobbering modularity, we eat up valuable
+   name space. */

 /* Number of bits in an unsigned char. */
 #define CHARBITS 8

 /* First integer value that is greater than any character code. */
-#define _NOTCHAR (1 << CHARBITS)
+#define NOTCHAR (1 << CHARBITS)

 /* INTBITS need not be exact, just a lower bound. */
 #define INTBITS (CHARBITS * sizeof (int))

 /* Number of ints required to hold a bit for every character. */
-#define _CHARSET_INTS ((_NOTCHAR + INTBITS - 1) / INTBITS)
+#define CHARCLASS_INTS ((NOTCHAR + INTBITS - 1) / INTBITS)

 /* Sets of unsigned characters are stored as bit vectors in arrays of ints. */
-typedef int _charset[_CHARSET_INTS];
+typedef int charclass[CHARCLASS_INTS];

 /* The regexp is parsed into an array of tokens in postfix form.  Some tokens
   are operators and others are terminal symbols.  Most (but not all) of these
   codes are returned by the lexical analyzer. */
-#if __STDC__

 typedef enum
 {
-  _END = -1,			/* _END is a terminal symbol that matches the
-				   end of input; any value of _END or less in
+  END = -1,			/* END is a terminal symbol that matches the
+				   end of input; any value of END or less in
 				   the parse tree is such a symbol.  Accepting
 				   states of the DFA are those that would have
-				   a transition on _END. */
+				   a transition on END. */

  /* Ordinary character values are terminal symbols that match themselves. */

-  _EMPTY = _NOTCHAR,		/* _EMPTY is a terminal symbol that matches
+  EMPTY = NOTCHAR,		/* EMPTY is a terminal symbol that matches
 				   the empty string. */

-  _BACKREF,			/* _BACKREF is generated by \<digit>; it
+  BACKREF,			/* BACKREF is generated by \<digit>; it
 				   it not completely handled.  If the scanner
 				   detects a transition on backref, it returns
 				   a kind of "semi-success" indicating that
 				   the match will have to be verified with
 				   a backtracking matcher. */

-  _BEGLINE,			/* _BEGLINE is a terminal symbol that matches
+  BEGLINE,			/* BEGLINE is a terminal symbol that matches
 				   the empty string if it is at the beginning
 				   of a line. */

-  _ALLBEGLINE,			/* _ALLBEGLINE is a terminal symbol that
-				   matches the empty string if it is at the
-				   beginning of a line; _ALLBEGLINE applies
-				   to the entire regexp and can only occur
-				   as the first token thereof.  _ALLBEGLINE
-				   never appears in the parse tree; a _BEGLINE
-				   is prepended with _CAT to the entire
-				   regexp instead. */
-
-  _ENDLINE,			/* _ENDLINE is a terminal symbol that matches
+  ENDLINE,			/* ENDLINE is a terminal symbol that matches
 				   the empty string if it is at the end of
 				   a line. */

-  _ALLENDLINE,			/* _ALLENDLINE is to _ENDLINE as _ALLBEGLINE
-				   is to _BEGLINE. */
-
-  _BEGWORD,			/* _BEGWORD is a terminal symbol that matches
+  BEGWORD,			/* BEGWORD is a terminal symbol that matches
 				   the empty string if it is at the beginning
 				   of a word. */

-  _ENDWORD,			/* _ENDWORD is a terminal symbol that matches
+  ENDWORD,			/* ENDWORD is a terminal symbol that matches
 				   the empty string if it is at the end of
 				   a word. */

-  _LIMWORD,			/* _LIMWORD is a terminal symbol that matches
+  LIMWORD,			/* LIMWORD is a terminal symbol that matches
 				   the empty string if it is at the beginning
 				   or the end of a word. */

-  _NOTLIMWORD,			/* _NOTLIMWORD is a terminal symbol that
+  NOTLIMWORD,			/* NOTLIMWORD is a terminal symbol that
 				   matches the empty string if it is not at
 				   the beginning or end of a word. */

-  _QMARK,			/* _QMARK is an operator of one argument that
+  QMARK,			/* QMARK is an operator of one argument that
 				   matches zero or one occurences of its
 				   argument. */

-  _STAR,			/* _STAR is an operator of one argument that
+  STAR,				/* STAR is an operator of one argument that
 				   matches the Kleene closure (zero or more
 				   occurrences) of its argument. */

-  _PLUS,			/* _PLUS is an operator of one argument that
+  PLUS,				/* PLUS is an operator of one argument that
 				   matches the positive closure (one or more
 				   occurrences) of its argument. */

-  _CAT,				/* _CAT is an operator of two arguments that
+  REPMN,			/* REPMN is a lexical token corresponding
+				   to the {m,n} construct.  REPMN never
+				   appears in the compiled token vector. */
+
+  CAT,				/* CAT is an operator of two arguments that
 				   matches the concatenation of its
-				   arguments.  _CAT is never returned by the
+				   arguments.  CAT is never returned by the
 				   lexical analyzer. */

-  _OR,				/* _OR is an operator of two arguments that
+  OR,				/* OR is an operator of two arguments that
 				   matches either of its arguments. */

-  _LPAREN,			/* _LPAREN never appears in the parse tree,
+  ORTOP,			/* OR at the toplevel in the parse tree.
+				   This is used for a boyer-moore heuristic. */
+
+  LPAREN,			/* LPAREN never appears in the parse tree,
 				   it is only a lexeme. */

-  _RPAREN,			/* _RPAREN never appears in the parse tree. */
+  RPAREN,			/* RPAREN never appears in the parse tree. */

-  _SET				/* _SET and (and any value greater) is a
+  CSET				/* CSET and (and any value greater) is a
 				   terminal symbol that matches any of a
 				   class of characters. */
-} _token;
+} token;

-#else /* ! __STDC__ */
-
-typedef short _token;
-
-#define _END -1
-#define _EMPTY _NOTCHAR
-#define _BACKREF (_EMPTY + 1)
-#define _BEGLINE (_EMPTY + 2)
-#define _ALLBEGLINE (_EMPTY + 3)
-#define _ENDLINE (_EMPTY + 4)
-#define _ALLENDLINE (_EMPTY + 5)
-#define _BEGWORD (_EMPTY + 6)
-#define _ENDWORD (_EMPTY + 7)
-#define _LIMWORD (_EMPTY + 8)
-#define _NOTLIMWORD (_EMPTY + 9)
-#define _QMARK (_EMPTY + 10)
-#define _STAR (_EMPTY + 11)
-#define _PLUS (_EMPTY + 12)
-#define _CAT (_EMPTY + 13)
-#define _OR (_EMPTY + 14)
-#define _LPAREN (_EMPTY + 15)
-#define _RPAREN (_EMPTY + 16)
-#define _SET (_EMPTY + 17)
-
-#endif /* ! __STDC__ */
-
-/* Sets are stored in an array in the compiled regexp; the index of the
-   array corresponding to a given set token is given by _SET_INDEX(t). */
-#define _SET_INDEX(t) ((t) - _SET)
+/* Sets are stored in an array in the compiled dfa; the index of the
+   array corresponding to a given set token is given by SET_INDEX(t). */
+#define SET_INDEX(t) ((t) - CSET)

 /* Sometimes characters can only be matched depending on the surrounding
   context.  Such context decisions depend on what the previous character
@@ -239,36 +144,36 @@ typedef short _token;

   Word-constituent characters are those that satisfy isalnum().

-   The macro _SUCCEEDS_IN_CONTEXT determines whether a a given constraint
+   The macro SUCCEEDS_IN_CONTEXT determines whether a given constraint
   succeeds in a particular context.  Prevn is true if the previous character
   was a newline, currn is true if the lookahead character is a newline.
   Prevl and currl similarly depend upon whether the previous and current
   characters are word-constituent letters. */
-#define _MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn) \
-  ((constraint) & 1 << ((prevn) ? 2 : 0) + ((currn) ? 1 : 0) + 4)
-#define _MATCHES_LETTER_CONTEXT(constraint, prevl, currl) \
-  ((constraint) & 1 << ((prevl) ? 2 : 0) + ((currl) ? 1 : 0))
-#define _SUCCEEDS_IN_CONTEXT(constraint, prevn, currn, prevl, currl) \
-  (_MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn)		     \
-   && _MATCHES_LETTER_CONTEXT(constraint, prevl, currl))
+#define MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn) \
+  ((constraint) & 1 << (((prevn) ? 2 : 0) + ((currn) ? 1 : 0) + 4))
+#define MATCHES_LETTER_CONTEXT(constraint, prevl, currl) \
+  ((constraint) & 1 << (((prevl) ? 2 : 0) + ((currl) ? 1 : 0)))
+#define SUCCEEDS_IN_CONTEXT(constraint, prevn, currn, prevl, currl) \
+  (MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn)		     \
+   && MATCHES_LETTER_CONTEXT(constraint, prevl, currl))

 /* The following macros give information about what a constraint depends on. */
-#define _PREV_NEWLINE_DEPENDENT(constraint) \
+#define PREV_NEWLINE_DEPENDENT(constraint) \
  (((constraint) & 0xc0) >> 2 != ((constraint) & 0x30))
-#define _PREV_LETTER_DEPENDENT(constraint) \
+#define PREV_LETTER_DEPENDENT(constraint) \
  (((constraint) & 0x0c) >> 2 != ((constraint) & 0x03))

 /* Tokens that match the empty string subject to some constraint actually
   work by applying that constraint to determine what may follow them,
   taking into account what has gone before.  The following values are
   the constraints corresponding to the special tokens previously defined. */
-#define _NO_CONSTRAINT 0xff
-#define _BEGLINE_CONSTRAINT 0xcf
-#define _ENDLINE_CONSTRAINT 0xaf
-#define _BEGWORD_CONSTRAINT 0xf2
-#define _ENDWORD_CONSTRAINT 0xf4
-#define _LIMWORD_CONSTRAINT 0xf6
-#define _NOTLIMWORD_CONSTRAINT 0xf9
+#define NO_CONSTRAINT 0xff
+#define BEGLINE_CONSTRAINT 0xcf
+#define ENDLINE_CONSTRAINT 0xaf
+#define BEGWORD_CONSTRAINT 0xf2
+#define ENDWORD_CONSTRAINT 0xf4
+#define LIMWORD_CONSTRAINT 0xf6
+#define NOTLIMWORD_CONSTRAINT 0xf9

 /* States of the recognizer correspond to sets of positions in the parse
   tree, together with the constraints under which they may be matched.
@@ -278,44 +183,48 @@ typedef struct
 {
  unsigned index;		/* Index into the parse array. */
  unsigned constraint;		/* Constraint for matching this position. */
-} _position;
+} position;

 /* Sets of positions are stored as arrays. */
 typedef struct
 {
-  _position *elems;		/* Elements of this position set. */
+  position *elems;		/* Elements of this position set. */
  int nelem;			/* Number of elements in this set. */
-} _position_set;
+} position_set;

-/* A state of the regexp consists of a set of positions, some flags,
+/* A state of the dfa consists of a set of positions, some flags,
   and the token value of the lowest-numbered position of the state that
-   contains an _END token. */
+   contains an END token. */
 typedef struct
 {
  int hash;			/* Hash of the positions of this state. */
-  _position_set elems;		/* Positions this state could match. */
+  position_set elems;		/* Positions this state could match. */
  char newline;			/* True if previous state matched newline. */
  char letter;			/* True if previous state matched a letter. */
  char backref;			/* True if this state matches a \<digit>. */
  unsigned char constraint;	/* Constraint for this state to accept. */
-  int first_end;		/* Token value of the first _END in elems. */
-} _dfa_state;
+  int first_end;		/* Token value of the first END in elems. */
+} dfa_state;

-/* If an r.e. is at most MUST_MAX characters long, we look for a string which
-   must appear in it; whatever's found is dropped into the struct reg. */
-
-#define MUST_MAX	50
+/* Element of a list of strings, at least one of which is known to
+   appear in any R.E. matching the DFA. */
+struct dfamust
+{
+  int exact;
+  char *must;
+  struct dfamust *next;
+};

 /* A compiled regular expression. */
-struct regexp
+struct dfa
 {
  /* Stuff built by the scanner. */
-  _charset *charsets;		/* Array of character sets for _SET tokens. */
-  int cindex;			/* Index for adding new charsets. */
-  int calloc;			/* Number of charsets currently allocated. */
+  charclass *charclasses;	/* Array of character sets for CSET tokens. */
+  int cindex;			/* Index for adding new charclasses. */
+  int calloc;			/* Number of charclasses currently allocated. */

  /* Stuff built by the parser. */
-  _token *tokens;		/* Postfix parse array. */
+  token *tokens;		/* Postfix parse array. */
  int tindex;			/* Index for adding new tokens. */
  int talloc;			/* Number of tokens currently allocated. */
  int depth;			/* Depth required of an evaluation stack
@@ -323,15 +232,15 @@ struct regexp
 				   parse tree. */
  int nleaves;			/* Number of leaves on the parse tree. */
  int nregexps;			/* Count of parallel regexps being built
-				   with regparse(). */
+				   with dfaparse(). */

  /* Stuff owned by the state builder. */
-  _dfa_state *states;		/* States of the regexp. */
+  dfa_state *states;		/* States of the dfa. */
  int sindex;			/* Index for adding new states. */
  int salloc;			/* Number of states currently allocated. */

  /* Stuff built by the structure analyzer. */
-  _position_set *follows;	/* Array of follow sets, indexed by position
+  position_set *follows;	/* Array of follow sets, indexed by position
 				   index.  The follow of a position is the set
 				   of positions containing characters that
 				   could conceivably follow a character
@@ -361,7 +270,7 @@ struct regexp
  int **fails;			/* Transition tables after failing to accept
 				   on a state that potentially could do so. */
  int *success;			/* Table of acceptance conditions used in
-				   regexecute and computed in build_state. */
+				   dfaexec and computed in build_state. */
  int *newlines;		/* Transitions on newlines.  The entry for a
 				   newline in any transition table is always
 				   -1 so we can count lines without wasting
@@ -369,40 +278,41 @@ struct regexp
 				   newline is stored separately and handled
 				   as a special case.  Newline is also used
 				   as a sentinel at the end of the buffer. */
-  char must[MUST_MAX];
-  int mustn;
+  struct dfamust *musts;	/* List of strings, at least one of which
+				   is known to appear in any r.e. matching
+				   the dfa. */
 };

-/* Some macros for user access to regexp internals. */
+/* Some macros for user access to dfa internals. */

 /* ACCEPTING returns true if s could possibly be an accepting state of r. */
 #define ACCEPTING(s, r) ((r).states[s].constraint)

 /* ACCEPTS_IN_CONTEXT returns true if the given state accepts in the
   specified context. */
-#define ACCEPTS_IN_CONTEXT(prevn, currn, prevl, currl, state, reg) \
-  _SUCCEEDS_IN_CONTEXT((reg).states[state].constraint,		   \
+#define ACCEPTS_IN_CONTEXT(prevn, currn, prevl, currl, state, dfa) \
+  SUCCEEDS_IN_CONTEXT((dfa).states[state].constraint,		   \
 		       prevn, currn, prevl, currl)

 /* FIRST_MATCHING_REGEXP returns the index number of the first of parallel
   regexps that a given state could accept.  Parallel regexps are numbered
   starting at 1. */
-#define FIRST_MATCHING_REGEXP(state, reg) (-(reg).states[state].first_end)
+#define FIRST_MATCHING_REGEXP(state, dfa) (-(dfa).states[state].first_end)

 /* Entry points. */

 #if __STDC__

-/* Regsyntax() takes two arguments; the first sets the syntax bits described
+/* dfasyntax() takes two arguments; the first sets the syntax bits described
   earlier in this file, and the second sets the case-folding flag. */
-extern void regsyntax(int, int);
+extern void dfasyntax(int, int);

-/* Compile the given string of the given length into the given struct regexp.
+/* Compile the given string of the given length into the given struct dfa.
   Final argument is a flag specifying whether to build a searching or an
   exact matcher. */
-extern void regcompile(const char *, size_t, struct regexp *, int);
+extern void dfacomp(char *, size_t, struct dfa *, int);

-/* Execute the given struct regexp on the buffer of characters.  The
+/* Execute the given struct dfa on the buffer of characters.  The
   first char * points to the beginning, and the second points to the
   first character after the end of the buffer, which must be a writable
   place so a sentinel end-of-buffer marker can be stored there.  The
@@ -414,37 +324,37 @@ extern void regcompile(const char *, size_t, struct regexp *, int);
   order to verify backreferencing; otherwise the flag will be cleared.
   Returns NULL if no match is found, or a pointer to the first
   character after the first & shortest matching string in the buffer. */
-extern char *regexecute(struct regexp *, char *, char *, int, int *, int *);
+extern char *dfaexec(struct dfa *, char *, char *, int, int *, int *);

-/* Free the storage held by the components of a struct regexp. */
-extern void regfree(struct regexp *);
+/* Free the storage held by the components of a struct dfa. */
+extern void dfafree(struct dfa *);

 /* Entry points for people who know what they're doing. */

-/* Initialize the components of a struct regexp. */
-extern void reginit(struct regexp *);
+/* Initialize the components of a struct dfa. */
+extern void dfainit(struct dfa *);

-/* Incrementally parse a string of given length into a struct regexp. */
-extern void regparse(const char *, size_t, struct regexp *);
+/* Incrementally parse a string of given length into a struct dfa. */
+extern void dfaparse(char *, size_t, struct dfa *);

 /* Analyze a parsed regexp; second argument tells whether to build a searching
   or an exact matcher. */
-extern void reganalyze(struct regexp *, int);
+extern void dfaanalyze(struct dfa *, int);

 /* Compute, for each possible character, the transitions out of a given
   state, storing them in an array of integers. */
-extern void regstate(int, struct regexp *, int []);
+extern void dfastate(int, struct dfa *, int []);

 /* Error handling. */

-/* Regerror() is called by the regexp routines whenever an error occurs.  It
+/* dfaerror() is called by the regexp routines whenever an error occurs.  It
   takes a single argument, a NUL-terminated string describing the error.
-   The default regerror() prints the error message to stderr and exits.
-   The user can provide a different regfree() if so desired. */
-extern void regerror(const char *);
+   The default dfaerror() prints the error message to stderr and exits.
+   The user can provide a different dfaerror() if so desired. */
+extern void dfaerror(char *);

 #else /* ! __STDC__ */
-extern void regsyntax(), regcompile(), regfree(), reginit(), regparse();
-extern void reganalyze(), regstate(), regerror();
-extern char *regexecute();
+extern void dfasyntax(), dfacomp(), dfafree(), dfainit(), dfaparse();
+extern void dfaanalyze(), dfastate(), dfaerror();
+extern char *dfaexec();
 #endif /* ! __STDC__ */
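The constraint macros in the dfa.h hunk above pack eight context cases into one byte. Reading the macro bodies, the high nibble holds one bit per (previous is newline, current is newline) combination and the low nibble one bit per (previous is letter, current is letter) combination; a constraint succeeds when both selected bits are set. The sketch below copies the macros and three constants from the hunk and wraps them in `constraint_ok()`, a hypothetical helper added here only to make the values easy to probe.

```c
#include <assert.h>

/* Copied from the new side of the dfa.h hunk above. */
#define MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn) \
  ((constraint) & 1 << (((prevn) ? 2 : 0) + ((currn) ? 1 : 0) + 4))
#define MATCHES_LETTER_CONTEXT(constraint, prevl, currl) \
  ((constraint) & 1 << (((prevl) ? 2 : 0) + ((currl) ? 1 : 0)))
#define SUCCEEDS_IN_CONTEXT(constraint, prevn, currn, prevl, currl) \
  (MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn) \
   && MATCHES_LETTER_CONTEXT(constraint, prevl, currl))

#define BEGLINE_CONSTRAINT 0xcf	/* high nibble 1100: previous char is newline */
#define ENDLINE_CONSTRAINT 0xaf	/* high nibble 1010: current char is newline */
#define BEGWORD_CONSTRAINT 0xf2	/* low nibble 0010: non-letter, then letter */

/* Hypothetical wrapper (not part of dfa.h): evaluate a constraint in a
   given context and normalize the result to 0 or 1. */
int
constraint_ok(unsigned constraint, int prevn, int currn, int prevl, int currl)
{
  assert(constraint <= 0xff);
  return SUCCEEDS_IN_CONTEXT(constraint, prevn, currn, prevl, currl) ? 1 : 0;
}
```

For instance, `constraint_ok(BEGLINE_CONSTRAINT, 1, 0, 0, 1)` yields 1 because the previous character is a newline, while the same call with `prevn` set to 0 yields 0 regardless of the letter context.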
--- a/gnu/usr.bin/grep/getopt.c
+++ b/gnu/usr.bin/grep/getopt.c
@@ -3,12 +3,13 @@
   "Keep this file name-space clean" means, talk to roland@gnu.ai.mit.edu
   before changing it!

-   Copyright (C) 1987, 88, 89, 90, 91, 1992 Free Software Foundation, Inc.
+   Copyright (C) 1987, 88, 89, 90, 91, 92, 1993
+   	Free Software Foundation, Inc.

-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
+   This program is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 2, or (at your option) any
+   later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
@@ -17,49 +18,67 @@

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
-   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  */
+   Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.  */

-/* AIX requires this to be the first thing in the file. */
+/* NOTE!!!  AIX requires this to be the first thing in the file.
+   Do not put ANYTHING before it!  */
+#if !defined (__GNUC__) && defined (_AIX)
+ #pragma alloca
+#endif
+
+#ifdef HAVE_CONFIG_H
+#include "config.h"
+#endif
+
 #ifdef __GNUC__
 #define alloca __builtin_alloca
 #else /* not __GNUC__ */
-#if defined(sparc) && !defined(USG) && !defined(SVR4) && !defined(__svr4__)
+#if defined (HAVE_ALLOCA_H) || (defined(sparc) && (defined(sun) || (!defined(USG) && !defined(SVR4) && !defined(__svr4__))))
 #include <alloca.h>
 #else
-#ifdef _AIX
- #pragma alloca
-#else
+#ifndef _AIX
 char *alloca ();
 #endif
-#endif /* sparc */
+#endif /* alloca.h */
 #endif /* not __GNUC__ */

-#ifdef	LIBC
-/* For when compiled as part of the GNU C library.  */
-#include <ansidecl.h>
+#if !__STDC__ && !defined(const) && IN_GCC
+#define const
+#endif
+
+/* This tells Alpha OSF/1 not to define a getopt prototype in <stdio.h>.  */
+#ifndef _NO_PROTO
+#define _NO_PROTO
 #endif

 #include <stdio.h>

+/* Comment out all this code if we are using the GNU C Library, and are not
+   actually compiling the library itself.  This code is part of the GNU C
+   Library, but also included in many other GNU distributions.  Compiling
+   and linking in this code is a waste when using the GNU C library
+   (especially if it is a shared library).  Rather than having every GNU
+   program understand `configure --with-gnu-libc' and omit the object files,
+   it is simpler to just do this in the source for each such file.  */
+
+#if defined (_LIBC) || !defined (__GNU_LIBRARY__)
+
+
 /* This needs to come after some library #include
   to get __GNU_LIBRARY__ defined.  */
 #ifdef	__GNU_LIBRARY__
 #undef	alloca
+/* Don't include stdlib.h for non-GNU C libraries because some of them
+   contain conflicting prototypes for getopt.  */
 #include <stdlib.h>
-#include <string.h>
 #else	/* Not GNU C library.  */
 #define	__alloca	alloca
 #endif	/* GNU C library.  */

-
-#ifndef __STDC__
-#define const
-#endif
-
 /* If GETOPT_COMPAT is defined, `+' as well as `--' can introduce a
   long-named option.  Because this is not POSIX.2 compliant, it is
   being phased out.  */
-#define GETOPT_COMPAT
+/* #define GETOPT_COMPAT */

 /* This version of `getopt' appears to the caller like standard Unix `getopt'
   but it behaves differently for the user, since it allows the user
@ -97,6 +116,7 @@ char *optarg = 0;
   Otherwise, `optind' communicates from one call to the next
   how much of ARGV has been scanned so far.  */

+/* XXX 1003.2 says this must be 1 before any call.  */
 int optind = 0;

 /* The next char to be scanned in the option-element
@ -113,6 +133,12 @@ static char *nextchar;

 int opterr = 1;

+/* Set to an option character which was unrecognized.
+   This must be initialized on some systems to avoid linking in the
+   system's own getopt implementation.  */
+
+int optopt = '?';
+
 /* Describe how to deal with options that follow non-option ARGV-elements.

   If the caller did not specify anything,
@ -148,6 +174,10 @@ static enum
 } ordering;

 #ifdef	__GNU_LIBRARY__
+/* We want to avoid inclusion of string.h with non-GNU libraries
+   because there are many ways it can cause trouble.
+   On some systems, it contains special magic macros that don't work
+   in GCC.  */
 #include <string.h>
 #define	my_index	strchr
 #define	my_bcopy(src, dst, n)	memcpy ((dst), (src), (n))
@ -159,22 +189,23 @@ static enum
 char *getenv ();

 static char *
-my_index (string, chr)
-     char *string;
+my_index (str, chr)
+     const char *str;
     int chr;
 {
-  while (*string)
+  while (*str)
    {
-      if (*string == chr)
-	return string;
-      string++;
+      if (*str == chr)
+	return (char *) str;
+      str++;
    }
  return 0;
 }

 static void
 my_bcopy (from, to, size)
-     char *from, *to;
+     const char *from;
+     char *to;
     int size;
 {
  int i;
@ -210,10 +241,12 @@ exchange (argv)

  /* Interchange the two blocks of data in ARGV.  */

-  my_bcopy (&argv[first_nonopt], temp, nonopts_size);
-  my_bcopy (&argv[last_nonopt], &argv[first_nonopt],
+  my_bcopy ((char *) &argv[first_nonopt], (char *) temp, nonopts_size);
+  my_bcopy ((char *) &argv[last_nonopt], (char *) &argv[first_nonopt],
 	    (optind - last_nonopt) * sizeof (char *));
-  my_bcopy (temp, &argv[first_nonopt + optind - last_nonopt], nonopts_size);
+  my_bcopy ((char *) temp,
+	    (char *) &argv[first_nonopt + optind - last_nonopt],
+	    nonopts_size);

  /* Update records for the slots the non-options now occupy.  */

@ -489,7 +522,7 @@ _getopt_internal (argc, argv, optstring, longopts, longind, long_only)
 		    fprintf (stderr, "%s: option `%s' requires an argument\n",
 			     argv[0], argv[optind - 1]);
 		  nextchar += strlen (nextchar);
-		  return '?';
+		  return optstring[0] == ':' ? ':' : '?';
 		}
 	    }
 	  nextchar += strlen (nextchar);
@ -523,7 +556,7 @@ _getopt_internal (argc, argv, optstring, longopts, longind, long_only)
 		fprintf (stderr, "%s: unrecognized option `%c%s'\n",
 			 argv[0], argv[optind][0], nextchar);
 	    }
-	  nextchar += strlen (nextchar);
+	  nextchar = (char *) "";
 	  optind++;
 	  return '?';
 	}
@ -537,18 +570,24 @@ _getopt_internal (argc, argv, optstring, longopts, longind, long_only)

    /* Increment `optind' when we start to process its last character.  */
    if (*nextchar == '\0')
-      optind++;
+      ++optind;

    if (temp == NULL || c == ':')
      {
 	if (opterr)
 	  {
+#if 0
 	    if (c < 040 || c >= 0177)
 	      fprintf (stderr, "%s: unrecognized option, character code 0%o\n",
 		       argv[0], c);
 	    else
 	      fprintf (stderr, "%s: unrecognized option `-%c'\n", argv[0], c);
+#else
+	    /* 1003.2 specifies the format of this message.  */
+	    fprintf (stderr, "%s: illegal option -- %c\n", argv[0], c);
+#endif
 	  }
+	optopt = c;
 	return '?';
      }
    if (temp[1] == ':')
@ -568,7 +607,7 @@ _getopt_internal (argc, argv, optstring, longopts, longind, long_only)
 	else
 	  {
 	    /* This is an option that requires an argument.  */
-	    if (*nextchar != 0)
+	    if (*nextchar != '\0')
 	      {
 		optarg = nextchar;
 		/* If we end this ARGV-element by taking the rest as an arg,
@ -578,8 +617,20 @@ _getopt_internal (argc, argv, optstring, longopts, longind, long_only)
 	    else if (optind == argc)
 	      {
 		if (opterr)
+		  {
+#if 0
 		    fprintf (stderr, "%s: option `-%c' requires an argument\n",
 			     argv[0], c);
+#else
+		    /* 1003.2 specifies the format of this message.  */
+		    fprintf (stderr, "%s: option requires an argument -- %c\n",
+			     argv[0], c);
+#endif
+		  }
+		optopt = c;
+		if (optstring[0] == ':')
+		  c = ':';
+		else
 		  c = '?';
 	      }
 	    else
@ -604,6 +655,8 @@ getopt (argc, argv, optstring)
 			   (int *) 0,
 			   0);
 }
+
+#endif	/* _LIBC or not __GNU_LIBRARY__.  */

 #ifdef TEST

--- a/gnu/usr.bin/grep/getopt.h
+++ b/gnu/usr.bin/grep/getopt.h
@ -1,10 +1,10 @@
 /* Declarations for getopt.
-   Copyright (C) 1989, 1990, 1991, 1992 Free Software Foundation, Inc.
+   Copyright (C) 1989, 1990, 1991, 1992, 1993 Free Software Foundation, Inc.

-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
+   This program is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by the
+   Free Software Foundation; either version 2, or (at your option) any
+   later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
@ -13,11 +13,15 @@

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
-   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  */
+   Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.  */

 #ifndef _GETOPT_H
 #define _GETOPT_H 1

+#ifdef	__cplusplus
+extern "C" {
+#endif
+
 /* For communication from `getopt' to the caller.
   When `getopt' finds an option that takes an argument,
   the argument value is returned here.
@ -45,6 +49,10 @@ extern int optind;

 extern int opterr;

+/* Set to an option character which was unrecognized.  */
+
+extern int optopt;
+
 /* Describe the long-named options requested by the application.
   The LONG_OPTIONS argument to getopt_long or getopt_long_only is a vector
   of `struct option' terminated by an element containing a name which is
@ -82,15 +90,19 @@ struct option

 /* Names for the values of the `has_arg' field of `struct option'.  */

-enum _argtype
-{
-  no_argument,
-  required_argument,
-  optional_argument
-};
+#define	no_argument		0
+#define required_argument	1
+#define optional_argument	2

 #if __STDC__
+#if defined(__GNU_LIBRARY__)
+/* Many other libraries have conflicting prototypes for getopt, with
+   differences in the consts, in stdlib.h.  To avoid compilation
+   errors, only prototype getopt for the GNU C library.  */
 extern int getopt (int argc, char *const *argv, const char *shortopts);
+#else /* not __GNU_LIBRARY__ */
+extern int getopt ();
+#endif /* not __GNU_LIBRARY__ */
 extern int getopt_long (int argc, char *const *argv, const char *shortopts,
 		        const struct option *longopts, int *longind);
 extern int getopt_long_only (int argc, char *const *argv,
@ -110,4 +122,8 @@ extern int getopt_long_only ();
 extern int _getopt_internal ();
 #endif /* not __STDC__ */

+#ifdef	__cplusplus
+}
+#endif
+
 #endif /* _GETOPT_H */
--- a/gnu/usr.bin/grep/getpagesize.h
+++ b/gnu/usr.bin/grep/getpagesize.h
@ -0,0 +1,42 @@
+#ifdef BSD
+#ifndef BSD4_1
+#define HAVE_GETPAGESIZE
+#endif
+#endif
+
+#ifndef HAVE_GETPAGESIZE
+
+#ifdef VMS
+#define getpagesize() 512
+#endif
+
+#ifdef HAVE_UNISTD_H
+#include <unistd.h>
+#endif
+
+#ifdef _SC_PAGESIZE
+#define getpagesize() sysconf(_SC_PAGESIZE)
+#else
+
+#ifdef HAVE_SYS_PARAM_H
+#include <sys/param.h>
+
+#ifdef EXEC_PAGESIZE
+#define getpagesize() EXEC_PAGESIZE
+#else
+#ifdef NBPG
+#define getpagesize() (NBPG * CLSIZE)
+#ifndef CLSIZE
+#define CLSIZE 1
+#endif /* no CLSIZE */
+#else /* no NBPG */
+#define getpagesize() NBPC
+#endif /* no NBPG */
+#endif /* no EXEC_PAGESIZE */
+#else /* !HAVE_SYS_PARAM_H */
+#define getpagesize() 8192	/* punt totally */
+#endif /* !HAVE_SYS_PARAM_H */
+#endif /* no _SC_PAGESIZE */
+
+#endif /* not HAVE_GETPAGESIZE */
+
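
Reviewer note (not part of the patch): getpagesize.h picks a page-size source at compile time, preferring `sysconf(_SC_PAGESIZE)` and ending with a hard-coded 8192 when nothing else is available. A minimal sketch of the same preference order as a function, assuming a POSIX `<unistd.h>`; the names `page_size` and `FALLBACK_PAGE_SIZE` are hypothetical and do not appear in the patch.

```c
#include <unistd.h>

/* Same "punt totally" value the header uses as its last resort.  */
#define FALLBACK_PAGE_SIZE 8192L

/* Hypothetical helper: ask the system for the page size at run time
   and fall back to a fixed guess if the query is unavailable or fails.  */
static long
page_size (void)
{
#ifdef _SC_PAGESIZE
  long n = sysconf (_SC_PAGESIZE);

  if (n > 0)
    return n;
#endif
  return FALLBACK_PAGE_SIZE;
}
```

Unlike the macro chain in the header, this version also checks the value `sysconf` returns, since `sysconf` reports failure with -1.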
--- a/gnu/usr.bin/grep/grep.1
+++ b/gnu/usr.bin/grep/grep.1
@ -1,234 +1,375 @@
-.TH GREP 1 "1988 December 13" "GNU Project" \" -*- nroff -*-
-.UC 4
+.TH GREP 1 "1992 September 10" "GNU Project"
 .SH NAME
-grep, egrep \- print lines matching a regular expression
-.SH SYNOPSIS
+grep, egrep, fgrep \- print lines matching a pattern
+.SH SYNOPSIS
 .B grep
 [
-.B \-CVbchilnsvwx
-] [
-.BI \- num
-] [
-.B \-AB
-.I num
-] [ [
+.BR \- [[ AB "] ]\c"
+.I "num"
+]
+[
+.BR \- [ CEFGVBchilnsvwx ]
+]
+[
 .B \-e
 ]
-.I expr
+.I pattern
 |
-.B \-f
-.I file
+.BI \-f file
 ] [
-.I "files ..."
+.I files...
 ]
 .SH DESCRIPTION
-.I Grep
-searches the files listed in the arguments (or standard
-input if no files are given) for all lines that contain a match for
-the given
-.IR expr .
-If any lines match, they are printed.
 .PP
-Also, if any matches were found,
-.I grep
-exits with a status of 0, but if no matches were found it exits
-with a status of 1.  This is useful for building shell scripts that
-use
-.I grep
-as a condition for, for example, the
-.I if
-statement.
+.B Grep
+searches the named input
+.I files
+(or standard input if no files are named, or
+the file name
+.B \-
+is given)
+for lines containing a match to the given
+.IR pattern .
+By default,
+.B grep
+prints the matching lines.
 .PP
-When invoked as
-.I egrep
-the syntax of the
-.I expr
-is slightly different; See below.
-.br
-.SH "REGULAR EXPRESSIONS"
-.RS 2.5i
-.ta 1i 2i
-.sp
-.ti -2.0i
-(grep)	(egrep)		(explanation)
-.sp
-.ti -2.0i
-\fIc\fP	\fIc\fP	a single (non-meta) character matches itself.
-.sp
-.ti -2.0i
-\&.	.	matches any single character except newline.
-.sp
-.ti -2.0i
-\\?	?	postfix operator; preceeding item is optional.
-.sp
-.ti -2.0i
-\(**	\(**	postfix operator; preceeding item 0 or
-more times.
-.sp
-.ti -2.0i
-\\+	+	postfix operator; preceeding item 1 or
-more times.
-.sp
-.ti -2.0i
-\\|	|	infix operator; matches either
-argument.
-.sp
-.ti -2.0i
-^	^	matches the empty string at the beginning of a line.
-.sp
-.ti -2.0i
-$	$	matches the empty string at the end of a line.
-.sp
-.ti -2.0i
-\\<	\\<	matches the empty string at the beginning of a word.
-.sp
-.ti -2.0i
-\\>	\\>	matches the empty string at the end of a word.
-.sp
-.ti -2.0i
-[\fIchars\fP]	[\fIchars\fP]	match any character in the given class; if the
-first character after [ is ^, match any character
-not in the given class; a range of characters may
-be specified by \fIfirst\-last\fP; for example, \\W
-(below) is equivalent to the class [^A\-Za\-z0\-9]
-.sp
-.ti -2.0i
-\\( \\)	( )	parentheses are used to override operator precedence.
-.sp
-.ti -2.0i
-\\\fIdigit\fP	\\\fIdigit\fP	\\\fIn\fP matches a repeat of the text
-matched earlier in the regexp by the subexpression inside the nth
-opening parenthesis.
-.sp
-.ti -2.0i
-\\	\\	any special character may be preceded
-by a backslash to match it literally.
-.sp
-.ti -2.0i
-(the following are for compatibility with GNU Emacs)
-.sp
-.ti -2.0i
-\\b	\\b	matches the empty string at the edge of a word.
-.sp
-.ti -2.0i
-\\B	\\B	matches the empty string if not at the edge of a word.
-.sp
-.ti -2.0i
-\\w	\\w	matches word-constituent characters (letters & digits).
-.sp
-.ti -2.0i
-\\W	\\W	matches characters that are not word-constituent.
-.RE
-.PP
-Operator precedence is (highest to lowest) ?, \(**, and +, concatenation,
-and finally |.  All other constructs are syntactically identical to
-normal characters.  For the truly interested, the file dfa.c describes
-(and implements) the exact grammar understood by the parser.
-.SH OPTIONS
+There are three major variants of
+.BR grep ,
+controlled by the following options.
+.PD 0
 .TP
-.BI \-A " num"
-print <num> lines of context after every matching line
+.B \-G
+Interpret
+.I pattern
+as a basic regular expression (see below).  This is the default.
 .TP
-.BI \-B " num"
-print
-.I num
-lines of context before every matching line
+.B \-E
+Interpret
+.I pattern
+as an extended regular expression (see below).
 .TP
-.B \-C
-print 2 lines of context on each side of every match
+.B \-F
+Interpret
+.I pattern
+as a list of fixed strings, separated by newlines,
+any of which is to be matched.
+.LP
+In addition, two variant programs
+.B egrep
+and
+.B fgrep
+are available.
+.B Egrep
+is similar (but not identical) to
+.BR "grep\ \-E" ,
+and is compatible with the historical Unix
+.BR egrep .
+.B Fgrep
+is the same as
+.BR "grep\ \-F" .
+.PD
+.LP
+All variants of
+.B grep
+understand the following options:
+.PD 0
 .TP
 .BI \- num
-print
+Matches will be printed with
 .I num
-lines of context on each side of every match
+lines of leading and trailing context.  However,
+.B grep
+will never print any given line more than once.
+.TP
+.BI \-A " num"
+Print
+.I num
+lines of trailing context after matching lines.
+.TP
+.BI \-B " num"
+Print
+.I num
+lines of leading context before matching lines.
+.TP
+.B \-C
+Equivalent to
+.BR \-2 .
 .TP
 .B \-V
-print the version number on the diagnostic output
+Print the version number of
+.B grep
+to standard error.  This version number should
+be included in all bug reports (see below).
 .TP
 .B \-b
-print every match preceded by its byte offset
+Print the byte offset within the input file before
+each line of output.
 .TP
 .B \-c
-print a total count of matching lines only
+Suppress normal output; instead print a count of
+matching lines for each input file.
+With the
+.B \-v
+option (see below), count non-matching lines.
 .TP
-.BI \-e " expr"
-search for
-.IR expr ;
-useful if
-.I expr
-begins with \-
+.BI \-e " pattern"
+Use
+.I pattern
+as the pattern; useful to protect patterns beginning with
+.BR \- .
 .TP
 .BI \-f " file"
-search for the expression contained in
-.I file
+Obtain the pattern from
+.IR file .
 .TP
 .B \-h
-don't display filenames on matches
+Suppress the prefixing of filenames on output
+when multiple files are searched.
 .TP
 .B \-i
-ignore case difference when comparing strings
+Ignore case distinctions in both the
+.I pattern
+and the input files.
+.TP
+.B \-L
+Suppress normal output; instead print the name
+of each input file from which no output would
+normally have been printed.
 .TP
 .B \-l
-list files containing matches only
+Suppress normal output; instead print
+the name of each input file from which output
+would normally have been printed.
 .TP
 .B \-n
-print each match preceded by its line number
+Prefix each line of output with the line number
+within its input file.
+.TP
+.B \-q
+Quiet; suppress normal output.
 .TP
 .B \-s
-run silently producing no output except error messages
+Suppress error messages about nonexistent or unreadable files.
 .TP
 .B \-v
-print only lines that contain no matches for the <expr>
+Invert the sense of matching, to select non-matching lines.
 .TP
 .B \-w
-print only lines where the match is a complete word
+Select only those lines containing matches that form whole words.
+The test is that the matching substring must either be at the
+beginning of the line, or preceded by a non-word-constituent
+character.  Similarly, it must be either at the end of the line
+or followed by a non-word-constituent character.  Word-constituent
+characters are letters, digits, and the underscore.
 .TP
 .B \-x
-print only lines where the match is a whole line
-.SH "SEE ALSO"
-emacs(1), ed(1), sh(1),
-.I "GNU Emacs Manual"
-.SH INCOMPATIBILITIES
-The following incompatibilities with UNIX
-.I grep
-exist:
+Select only those matches that exactly match the whole line.
+.PD
+.SH "REGULAR EXPRESSIONS"
 .PP
-.RS 0.5i
-The context-dependent meaning of \(** is not quite the same (grep only).
+A regular expression is a pattern that describes a set of strings.
+Regular expressions are constructed analogously to arithmetic
+expressions, by using various operators to combine smaller expressions.
 .PP
-.B \-b
-prints a byte offset instead of a block offset.
+.B Grep
+understands two different versions of regular expression syntax:
+``basic'' and ``extended.''  In
+.RB "GNU\ " grep ,
+there is no difference in available functionality using either syntax.
+In other implementations, basic regular expressions are less powerful.
+The following description applies to extended regular expressions;
+differences for basic regular expressions are summarized afterwards.
 .PP
-The {\fIm,n\fP} construct of System V grep is not implemented.
+The fundamental building blocks are the regular expressions that match
+a single character.  Most characters, including all letters and digits,
+are regular expressions that match themselves.  Any metacharacter with
+special meaning may be quoted by preceding it with a backslash.
 .PP
+A list of characters enclosed by
+.B [
+and
+.B ]
+matches any single
+character in that list; if the first character of the list
+is the caret
+.B ^
+then it matches any character
+.I not
+in the list.
+For example, the regular expression
+.B [0123456789]
+matches any single digit.  A range of ASCII characters
+may be specified by giving the first and last characters, separated
+by a hyphen.
+Finally, certain named classes of characters are predefined.
+Their names are self-explanatory, and they are
+.BR [:alnum:] ,
+.BR [:alpha:] ,
+.BR [:cntrl:] ,
+.BR [:digit:] ,
+.BR [:graph:] ,
+.BR [:lower:] ,
+.BR [:print:] ,
+.BR [:punct:] ,
+.BR [:space:] ,
+.BR [:upper:] ,
+and
+.BR [:xdigit:] .
+For example,
+.B [[:alnum:]]
+means
+.BR [0-9A-Za-z] ,
+except the latter form is dependent upon the ASCII character encoding,
+whereas the former is portable.
+(Note that the brackets in these class names are part of the symbolic
+names, and must be included in addition to the brackets delimiting
+the bracket list.)  Most metacharacters lose their special meaning
+inside lists.  To include a literal
+.B ]
+place it first in the list.  Similarly, to include a literal
+.B ^
+place it anywhere but first.  Finally, to include a literal
+.B \-
+place it last.
+.PP
+The period
+.B .
+matches any single character.
+The symbol
+.B \ew
+is a synonym for
+.B [[:alnum:]]
+and
+.B \eW
+is a synonym for
+.BR [^[:alnum:]] .
+.PP
+The caret
+.B ^
+and the dollar sign
+.B $
+are metacharacters that respectively match the empty string at the
+beginning and end of a line.
+The symbols
+.B \e<
+and
+.B \e>
+respectively match the empty string at the beginning and end of a word.
+The symbol
+.B \eb
+matches the empty string at the edge of a word,
+and
+.B \eB
+matches the empty string provided it's
+.I not
+at the edge of a word.
+.PP
+A regular expression matching a single character may be followed
+by one of several repetition operators:
+.PD 0
+.TP
+.B ?
+The preceding item is optional and matched at most once.
+.TP
+.B *
+The preceding item will be matched zero or more times.
+.TP
+.B +
+The preceding item will be matched one or more times.
+.TP
+.BI { n }
+The preceding item is matched exactly
+.I n
+times.
+.TP
+.BI { n ,}
+The preceding item is matched
+.I n
+or more times.
+.TP
+.BI {, m }
+The preceding item is optional and is matched at most
+.I m
+times.
+.TP
+.BI { n , m }
+The preceding item is matched at least
+.I n
+times, but not more than
+.I m
+times.
+.PD
+.PP
+Two regular expressions may be concatenated; the resulting
+regular expression matches any string formed by concatenating
+two substrings that respectively match the concatenated
+subexpressions.
+.PP
+Two regular expressions may be joined by the infix operator
+.BR | ;
+the resulting regular expression matches any string matching
+either subexpression.
+.PP
+Repetition takes precedence over concatenation, which in turn
+takes precedence over alternation.  A whole subexpression may be
+enclosed in parentheses to override these precedence rules.
+.PP
+The backreference
+.BI \e n\c
+\&, where
+.I n
+is a single digit, matches the substring
+previously matched by the
+.IR n th
+parenthesized subexpression of the regular expression.
+.PP
+In basic regular expressions the metacharacters
+.BR ? ,
+.BR + ,
+.BR { ,
+.BR | ,
+.BR ( ,
+and
+.BR )
+lose their special meaning; instead use the backslashed
+versions
+.BR \e? ,
+.BR \e+ ,
+.BR \e{ ,
+.BR \e| ,
+.BR \e( ,
+and
+.BR \e) .
+.PP
+In
+.B egrep
+the metacharacter
+.B {
+loses its special meaning; instead use
+.BR \e{ .
+.SH DIAGNOSTICS
+.PP
+Normally, exit status is 0 if matches were found,
+and 1 if no matches were found.  (The
+.B \-v
+option inverts the sense of the exit status.)
+Exit status is 2 if there were syntax errors
+in the pattern, inaccessible input files, or
+other system errors.
 .SH BUGS
-GNU \fIe?grep\fP has been thoroughly debugged and tested over a period
-of several years; we think it's a reliable beast or we wouldn't
-distribute it.  If by some fluke of the universe you discover a bug,
-send a detailed description (including options, regular expressions,
-and a copy of an input file that can reproduce it) to mike@ai.mit.edu.
 .PP
-.SH AUTHORS
-Mike Haertel wrote the deterministic regexp code and the bulk
-of the program.
+Email bug reports to
+.BR bug-gnu-utils@prep.ai.mit.edu .
+Be sure to include the word ``grep'' somewhere in the ``Subject:'' field.
 .PP
-James A. Woods is responsible for the hybridized search strategy
-of using Boyer-Moore-Gosper fixed-string search as a filter
-before calling the general regexp matcher.
+Large repetition counts in the
+.BI { m , n }
+construct may cause grep to use lots of memory.
+In addition,
+certain other obscure regular expressions require exponential time
+and space, and may cause
+.B grep
+to run out of memory.
 .PP
-Arthur David Olson contributed code that finds fixed strings for
-the aforementioned BMG search for a large class of regexps.
-.PP
-Richard Stallman wrote the backtracking regexp matcher that is used
-for \\\fIdigit\fP backreferences, as well as the GNU getopt.  The
-backtracking matcher was originally written for GNU Emacs.
-.PP
-D. A. Gwyn wrote the C alloca emulation that is provided so
-System V machines can run this program.  (Alloca is used only
-by RMS' backtracking matcher, and then only rarely, so there
-is no loss if your machine doesn't have a "real" alloca.)
-.PP
-Scott Anderson and Henry Spencer designed the regression tests
-used in the "regress" script.
-.PP
-Paul Placeway wrote the original version of this manual page.
+Backreferences are very slow, and may require exponential time.
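
Reviewer note (not part of the patch): the REGULAR EXPRESSIONS section above says that in basic syntax `?`, `+`, `{`, `|`, `(` and `)` are ordinary characters unless backslashed, while in extended syntax they are operators. The sketch below illustrates that distinction, along with a named character class and an interval. It uses the POSIX `regcomp`/`regexec` interface from `<regex.h>`, not grep's own dfa or regex matchers, so it only approximates what `grep -G` and `grep -E` do. The helper name `line_matches` is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if LINE contains a match for PATTERN,
   0 if it does not, and -1 if PATTERN fails to compile.
   Pass REG_EXTENDED in CFLAGS for extended syntax, 0 for basic.  */
static int
line_matches (const char *pattern, int cflags, const char *line)
{
  regex_t re;
  int rc;

  /* REG_NOSUB: only a yes/no answer is needed, not match offsets.  */
  if (regcomp (&re, pattern, cflags | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

For example, `line_matches ("[[:digit:]]{3}", REG_EXTENDED, line)` asks for three consecutive digits in extended syntax, and the basic-syntax spelling of the same pattern is `"[[:digit:]]\\{3\\}"` with `cflags` of 0.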
--- a/gnu/usr.bin/grep/grep.c
+++ b/gnu/usr.bin/grep/grep.c
--- a/gnu/usr.bin/grep/grep.h
+++ b/gnu/usr.bin/grep/grep.h
@ -0,0 +1,53 @@
+/* grep.h - interface to grep driver for searching subroutines.
+   Copyright (C) 1992 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */
+
+#if __STDC__
+
+extern void fatal(const char *, int);
+
+/* Grep.c expects the matchers vector to be terminated
+   by an entry with a NULL name, and to contain at least
+   an entry named "default". */
+
+extern struct matcher
+{
+  char *name;
+  void (*compile)(char *, size_t);
+  char *(*execute)(char *, size_t, char **);
+} matchers[];
+
+#else
+
+extern void fatal();
+
+extern struct matcher
+{
+  char *name;
+  void (*compile)();
+  char *(*execute)();
+} matchers[];
+
+#endif
+
+/* Exported from grep.c. */
+extern char *matcher;
+
+/* The following flags are exported from grep for the matchers
+   to look at. */
+extern int match_icase;		/* -i */
+extern int match_words;		/* -w */
+extern int match_lines;		/* -x */
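
Reviewer note (not part of the patch): grep.h states that the `matchers` vector ends with an entry whose name is NULL and always contains an entry named "default". The sketch below shows one way a driver could search such a table by name. The lookup code in grep.c is not visible in this diff, so `find_matcher`, `demo_matchers`, the placeholder routines, and the entry names other than "default" are all assumptions. The sketch also declares `name` as `const char *`, where the header uses `char *`.

```c
#include <stddef.h>
#include <string.h>

/* Same shape as the struct declared in grep.h.  */
struct matcher
{
  const char *name;
  void (*compile) (char *, size_t);
  char *(*execute) (char *, size_t, char **);
};

/* Placeholder routines standing in for real compile/execute code.  */
static void
noop_compile (char *pattern, size_t size)
{
  (void) pattern;
  (void) size;
}

static char *
noop_execute (char *buf, size_t size, char **endp)
{
  (void) buf;
  (void) size;
  (void) endp;
  return NULL;
}

/* Hypothetical table, terminated by a NULL name as grep.h requires.  */
static struct matcher demo_matchers[] = {
  { "default", noop_compile, noop_execute },
  { "egrep", noop_compile, noop_execute },
  { "fgrep", noop_compile, noop_execute },
  { NULL, NULL, NULL }
};

/* Walk the table until the NULL-name terminator; return the entry
   whose name equals NAME, or NULL if there is none.  */
static struct matcher *
find_matcher (const char *name)
{
  struct matcher *m;

  for (m = demo_matchers; m->name != NULL; ++m)
    if (strcmp (m->name, name) == 0)
      return m;
  return NULL;
}
```

A driver built this way can fall back to `find_matcher ("default")` when the requested name is not found, which is one reason the "default" entry must always be present.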
--- a/gnu/usr.bin/grep/kwset.c
+++ b/gnu/usr.bin/grep/kwset.c
@ -0,0 +1,805 @@
+/* kwset.c - search for any of a set of keywords.
+   Copyright 1989 Free Software Foundation
+		  Written August 1989 by Mike Haertel.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 1, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+   The author may be reached (Email) at the address mike@ai.mit.edu,
+   or (US mail) as Mike Haertel c/o Free Software Foundation. */
+
+/* The algorithm implemented by these routines bears a startling resemblance
+   to one discovered by Beate Commentz-Walter, although it is not identical.
+   See "A String Matching Algorithm Fast on the Average," Technical Report,
+   IBM-Germany, Scientific Center Heidelberg, Tiergartenstrasse 15, D-6900
+   Heidelberg, Germany.  See also Aho, A.V., and M. Corasick, "Efficient
+   String Matching:  An Aid to Bibliographic Search," CACM June 1975,
+   Vol. 18, No. 6, which describes the failure function used below. */
+
+
+#ifdef STDC_HEADERS
+#include <limits.h>
+#include <stdlib.h>
+#else
+#define INT_MAX 2147483647
+#define UCHAR_MAX 255
+#ifdef __STDC__
+#include <stddef.h>
+#else
+#include <sys/types.h>
+#endif
+extern char *malloc();
+extern void free();
+#endif
+
+#ifdef HAVE_MEMCHR
+#include <string.h>
+#ifdef NEED_MEMORY_H
+#include <memory.h>
+#endif
+#else
+#ifdef __STDC__
+extern void *memchr();
+#else
+extern char *memchr();
+#endif
+#endif
+
+#ifdef GREP
+extern char *xmalloc();
+#define malloc xmalloc
+#endif
+
+#include "kwset.h"
+#include "obstack.h"
+
+#define NCHAR (UCHAR_MAX + 1)
+#define obstack_chunk_alloc malloc
+#define obstack_chunk_free free
+
+/* Balanced tree of edges and labels leaving a given trie node. */
+struct tree
+{
+  struct tree *llink;		/* Left link; MUST be first field. */
+  struct tree *rlink;		/* Right link (to larger labels). */
+  struct trie *trie;		/* Trie node pointed to by this edge. */
+  unsigned char label;		/* Label on this edge. */
+  char balance;			/* Difference in depths of subtrees. */
+};
+
+/* Node of a trie representing a set of reversed keywords. */
+struct trie
+{
+  unsigned int accepting;	/* Word index of accepted word, or zero. */
+  struct tree *links;		/* Tree of edges leaving this node. */
+  struct trie *parent;		/* Parent of this node. */
+  struct trie *next;		/* List of all trie nodes in level order. */
+  struct trie *fail;		/* Aho-Corasick failure function. */
+  int depth;			/* Depth of this node from the root. */
+  int shift;			/* Shift function for search failures. */
+  int maxshift;			/* Max shift of self and descendents. */
+};
+
+/* Structure returned opaquely to the caller, containing everything. */
+struct kwset
+{
+  struct obstack obstack;	/* Obstack for node allocation. */
+  int words;			/* Number of words in the trie. */
+  struct trie *trie;		/* The trie itself. */
+  int mind;			/* Minimum depth of an accepting node. */
+  int maxd;			/* Maximum depth of any node. */
+  unsigned char delta[NCHAR];	/* Delta table for rapid search. */
+  struct trie *next[NCHAR];	/* Table of children of the root. */
+  char *target;			/* Target string if there's only one. */
+  int mind2;			/* Used in Boyer-Moore search for one string. */
+  char *trans;			/* Character translation table. */
+};
+
+/* Allocate and initialize a keyword set object, returning an opaque
+   pointer to it.  Return NULL if memory is not available. */
+kwset_t
+kwsalloc(trans)
+     char *trans;
+{
+  struct kwset *kwset;
+
+  kwset = (struct kwset *) malloc(sizeof (struct kwset));
+  if (!kwset)
+    return 0;
+
+  obstack_init(&kwset->obstack);
+  kwset->words = 0;
+  kwset->trie
+    = (struct trie *) obstack_alloc(&kwset->obstack, sizeof (struct trie));
+  if (!kwset->trie)
+    {
+      kwsfree((kwset_t) kwset);
+      return 0;
+    }
+  kwset->trie->accepting = 0;
+  kwset->trie->links = 0;
+  kwset->trie->parent = 0;
+  kwset->trie->next = 0;
+  kwset->trie->fail = 0;
+  kwset->trie->depth = 0;
+  kwset->trie->shift = 0;
+  kwset->mind = INT_MAX;
+  kwset->maxd = -1;
+  kwset->target = 0;
+  kwset->trans = trans;
+
+  return (kwset_t) kwset;
+}
+
+/* Add the given string to the contents of the keyword set.  Return NULL
+   for success, an error message otherwise. */
+char *
+kwsincr(kws, text, len)
+     kwset_t kws;
+     char *text;
+     size_t len;
+{
+  struct kwset *kwset;
+  register struct trie *trie;
+  register unsigned char label;
+  register struct tree *link;
+  register int depth;
+  struct tree *links[12];
+  enum { L, R } dirs[12];
+  struct tree *t, *r, *l, *rl, *lr;
+
+  kwset = (struct kwset *) kws;
+  trie = kwset->trie;
+  text += len;
+
+  /* Descend the trie (built of reversed keywords) character-by-character,
+     installing new nodes when necessary. */
+  while (len--)
+    {
+      label = kwset->trans ? kwset->trans[(unsigned char) *--text] : *--text;
+
+      /* Descend the tree of outgoing links for this trie node,
+	 looking for the current character and keeping track
+	 of the path followed. */
+      link = trie->links;
+      links[0] = (struct tree *) &trie->links;
+      dirs[0] = L;
+      depth = 1;
+
+      while (link && label != link->label)
+	{
+	  links[depth] = link;
+	  if (label < link->label)
+	    dirs[depth++] = L, link = link->llink;
+	  else
+	    dirs[depth++] = R, link = link->rlink;
+	}
+
+      /* The current character doesn't have an outgoing link at
+	 this trie node, so build a new trie node and install
+	 a link in the current trie node's tree. */
+      if (!link)
+	{
+	  link = (struct tree *) obstack_alloc(&kwset->obstack,
+					       sizeof (struct tree));
+	  if (!link)
+	    return "memory exhausted";
+	  link->llink = 0;
+	  link->rlink = 0;
+	  link->trie = (struct trie *) obstack_alloc(&kwset->obstack,
+						     sizeof (struct trie));
+	  if (!link->trie)
+	    return "memory exhausted";
+	  link->trie->accepting = 0;
+	  link->trie->links = 0;
+	  link->trie->parent = trie;
+	  link->trie->next = 0;
+	  link->trie->fail = 0;
+	  link->trie->depth = trie->depth + 1;
+	  link->trie->shift = 0;
+	  link->label = label;
+	  link->balance = 0;
+
+	  /* Install the new tree node in its parent. */
+	  if (dirs[--depth] == L)
+	    links[depth]->llink = link;
+	  else
+	    links[depth]->rlink = link;
+
+	  /* Back up the tree fixing the balance flags. */
+	  while (depth && !links[depth]->balance)
+	    {
+	      if (dirs[depth] == L)
+		--links[depth]->balance;
+	      else
+		++links[depth]->balance;
+	      --depth;
+	    }
+
+	  /* Rebalance the tree by pointer rotations if necessary. */
+	  if (depth && ((dirs[depth] == L && --links[depth]->balance)
+			|| (dirs[depth] == R && ++links[depth]->balance)))
+	    {
+	      switch (links[depth]->balance)
+		{
+		case (char) -2:
+		  switch (dirs[depth + 1])
+		    {
+		    case L:
+		      r = links[depth], t = r->llink, rl = t->rlink;
+		      t->rlink = r, r->llink = rl;
+		      t->balance = r->balance = 0;
+		      break;
+		    case R:
+		      r = links[depth], l = r->llink, t = l->rlink;
+		      rl = t->rlink, lr = t->llink;
+		      t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
+		      l->balance = t->balance != 1 ? 0 : -1;
+		      r->balance = t->balance != (char) -1 ? 0 : 1;
+		      t->balance = 0;
+		      break;
+		    }
+		  break;
+		case 2:
+		  switch (dirs[depth + 1])
+		    {
+		    case R:
+		      l = links[depth], t = l->rlink, lr = t->llink;
+		      t->llink = l, l->rlink = lr;
+		      t->balance = l->balance = 0;
+		      break;
+		    case L:
+		      l = links[depth], r = l->rlink, t = r->llink;
+		      lr = t->llink, rl = t->rlink;
+		      t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
+		      l->balance = t->balance != 1 ? 0 : -1;
+		      r->balance = t->balance != (char) -1 ? 0 : 1;
+		      t->balance = 0;
+		      break;
+		    }
+		  break;
+		}
+
+	      if (dirs[depth - 1] == L)
+		links[depth - 1]->llink = t;
+	      else
+		links[depth - 1]->rlink = t;
+	    }
+	}
+
+      trie = link->trie;
+    }
+
+  /* Mark the node we finally reached as accepting, encoding the
+     index number of this word in the keyword set so far. */
+  if (!trie->accepting)
+    trie->accepting = 1 + 2 * kwset->words;
+  ++kwset->words;
+
+  /* Keep track of the longest and shortest string of the keyword set. */
+  if (trie->depth < kwset->mind)
+    kwset->mind = trie->depth;
+  if (trie->depth > kwset->maxd)
+    kwset->maxd = trie->depth;
+
+  return 0;
+}
+
+/* Enqueue the trie nodes referenced from the given tree in the
+   given queue. */
+static void
+enqueue(tree, last)
+     struct tree *tree;
+     struct trie **last;
+{
+  if (!tree)
+    return;
+  enqueue(tree->llink, last);
+  enqueue(tree->rlink, last);
+  (*last) = (*last)->next = tree->trie;
+}
+
+/* Compute the Aho-Corasick failure function for the trie nodes referenced
+   from the given tree, given the failure function for their parent as
+   well as a last resort failure node. */
+static void
+treefails(tree, fail, recourse)
+     register struct tree *tree;
+     struct trie *fail;
+     struct trie *recourse;
+{
+  register struct tree *link;
+
+  if (!tree)
+    return;
+
+  treefails(tree->llink, fail, recourse);
+  treefails(tree->rlink, fail, recourse);
+
+  /* Find, in the chain of fails going back to the root, the first
+     node that has a descendent on the current label. */
+  while (fail)
+    {
+      link = fail->links;
+      while (link && tree->label != link->label)
+	if (tree->label < link->label)
+	  link = link->llink;
+	else
+	  link = link->rlink;
+      if (link)
+	{
+	  tree->trie->fail = link->trie;
+	  return;
+	}
+      fail = fail->fail;
+    }
+
+  tree->trie->fail = recourse;
+}
+
+/* Lower the delta entry for each label on the links of the given tree
+   to DEPTH wherever the preexisting delta value is larger than DEPTH. */
+static void
+treedelta(tree, depth, delta)
+     register struct tree *tree;
+     register unsigned int depth;
+     unsigned char delta[];
+{
+  if (!tree)
+    return;
+  treedelta(tree->llink, depth, delta);
+  treedelta(tree->rlink, depth, delta);
+  if (depth < delta[tree->label])
+    delta[tree->label] = depth;
+}
+
+/* Return true if A has every label in B. */
+static int
+hasevery(a, b)
+     register struct tree *a;
+     register struct tree *b;
+{
+  if (!b)
+    return 1;
+  if (!hasevery(a, b->llink))
+    return 0;
+  if (!hasevery(a, b->rlink))
+    return 0;
+  while (a && b->label != a->label)
+    if (b->label < a->label)
+      a = a->llink;
+    else
+      a = a->rlink;
+  return !!a;
+}
+
+/* Compute a vector, indexed by character code, of the trie nodes
+   referenced from the given tree. */
+static void
+treenext(tree, next)
+     struct tree *tree;
+     struct trie *next[];
+{
+  if (!tree)
+    return;
+  treenext(tree->llink, next);
+  treenext(tree->rlink, next);
+  next[tree->label] = tree->trie;
+}
+
+/* Compute the shift for each trie node, as well as the delta
+   table and next cache for the given keyword set. */
+char *
+kwsprep(kws)
+     kwset_t kws;
+{
+  register struct kwset *kwset;
+  register int i;
+  register struct trie *curr, *fail;
+  register char *trans;
+  unsigned char delta[NCHAR];
+  struct trie *last, *next[NCHAR];
+
+  kwset = (struct kwset *) kws;
+
+  /* Initial values for the delta table; will be changed later.  The
+     delta entry for a given character is the smallest depth of any
+     node at which an outgoing edge is labeled by that character. */
+  if (kwset->mind < 256)
+    for (i = 0; i < NCHAR; ++i)
+      delta[i] = kwset->mind;
+  else
+    for (i = 0; i < NCHAR; ++i)
+      delta[i] = 255;
+
+  /* Check if we can use the simple Boyer-Moore algorithm, instead
+     of the hairy Commentz-Walter algorithm. */
+  if (kwset->words == 1 && kwset->trans == 0)
+    {
+      /* Looking for just one string.  Extract it from the trie. */
+      kwset->target = obstack_alloc(&kwset->obstack, kwset->mind);
+      for (i = kwset->mind - 1, curr = kwset->trie; i >= 0; --i)
+	{
+	  kwset->target[i] = curr->links->label;
+	  curr = curr->links->trie;
+	}
+      /* Build the Boyer Moore delta.  Boy that's easy compared to CW. */
+      for (i = 0; i < kwset->mind; ++i)
+	delta[(unsigned char) kwset->target[i]] = kwset->mind - (i + 1);
+      kwset->mind2 = kwset->mind;
+      /* Find the minimal delta2 shift that we might make after
+	 a backwards match has failed. */
+      for (i = 0; i < kwset->mind - 1; ++i)
+	if (kwset->target[i] == kwset->target[kwset->mind - 1])
+	  kwset->mind2 = kwset->mind - (i + 1);
+    }
+  else
+    {
+      /* Traverse the nodes of the trie in level order, simultaneously
+	 computing the delta table, failure function, and shift function. */
+      for (curr = last = kwset->trie; curr; curr = curr->next)
+	{
+	  /* Enqueue the immediate descendents in the level order queue. */
+	  enqueue(curr->links, &last);
+
+	  curr->shift = kwset->mind;
+	  curr->maxshift = kwset->mind;
+
+	  /* Update the delta table for the descendents of this node. */
+	  treedelta(curr->links, curr->depth, delta);
+
+	  /* Compute the failure function for the descendents of this node. */
+	  treefails(curr->links, curr->fail, kwset->trie);
+
+	  /* Update the shifts at each node in the current node's chain
+	     of fails back to the root. */
+	  for (fail = curr->fail; fail; fail = fail->fail)
+	    {
+	      /* If the current node has some outgoing edge that the fail
+		 doesn't, then the shift at the fail should be no larger
+		 than the difference of their depths. */
+	      if (!hasevery(fail->links, curr->links))
+		if (curr->depth - fail->depth < fail->shift)
+		  fail->shift = curr->depth - fail->depth;
+
+	      /* If the current node is accepting then the shift at the
+		 fail and its descendents should be no larger than the
+		 difference of their depths. */
+	      if (curr->accepting && fail->maxshift > curr->depth - fail->depth)
+		fail->maxshift = curr->depth - fail->depth;
+	    }
+	}
+
+      /* Traverse the trie in level order again, fixing up all nodes whose
+	 shift exceeds their inherited maxshift. */
+      for (curr = kwset->trie->next; curr; curr = curr->next)
+	{
+	  if (curr->maxshift > curr->parent->maxshift)
+	    curr->maxshift = curr->parent->maxshift;
+	  if (curr->shift > curr->maxshift)
+	    curr->shift = curr->maxshift;
+	}
+
+      /* Create a vector, indexed by character code, of the outgoing links
+	 from the root node. */
+      for (i = 0; i < NCHAR; ++i)
+	next[i] = 0;
+      treenext(kwset->trie->links, next);
+
+      if ((trans = kwset->trans) != 0)
+	for (i = 0; i < NCHAR; ++i)
+	  kwset->next[i] = next[(unsigned char) trans[i]];
+      else
+	for (i = 0; i < NCHAR; ++i)
+	  kwset->next[i] = next[i];
+    }
+
+  /* Fix things up for any translation table. */
+  if ((trans = kwset->trans) != 0)
+    for (i = 0; i < NCHAR; ++i)
+      kwset->delta[i] = delta[(unsigned char) trans[i]];
+  else
+    for (i = 0; i < NCHAR; ++i)
+      kwset->delta[i] = delta[i];
+
+  return 0;
+}
+
+#define U(C) ((unsigned char) (C))
+
+/* Fast Boyer-Moore search. */
+static char *
+bmexec(kws, text, size)
+     kwset_t kws;
+     char *text;
+     size_t size;
+{
+  struct kwset *kwset;
+  register unsigned char *d1;
+  register char *ep, *sp, *tp;
+  register int d, gc, i, len, md2;
+
+  kwset = (struct kwset *) kws;
+  len = kwset->mind;
+
+  if (len == 0)
+    return text;
+  if (len > size)
+    return 0;
+  if (len == 1)
+    return memchr(text, kwset->target[0], size);
+
+  d1 = kwset->delta;
+  sp = kwset->target + len;
+  gc = U(sp[-2]);
+  md2 = kwset->mind2;
+  tp = text + len;
+
+  /* Significance of 12: 1 (initial offset) + 10 (skip loop) + 1 (md2). */
+  if (size > 12 * len)
+    /* 11 is not a bug, the initial offset happens only once. */
+    for (ep = text + size - 11 * len;;)
+      {
+	while (tp <= ep)
+	  {
+	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])], tp += d;
+	    if (d == 0)
+	      goto found;
+	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])], tp += d;
+	    if (d == 0)
+	      goto found;
+	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])], tp += d;
+	    if (d == 0)
+	      goto found;
+	    d = d1[U(tp[-1])], tp += d;
+	    d = d1[U(tp[-1])], tp += d;
+	  }
+	break;
+      found:
+	if (U(tp[-2]) == gc)
+	  {
+	    for (i = 3; i <= len && U(tp[-i]) == U(sp[-i]); ++i)
+	      ;
+	    if (i > len)
+	      return tp - len;
+	  }
+	tp += md2;
+      }
+
+  /* Now we have only a few characters left to search.  We
+     carefully avoid ever producing an out-of-bounds pointer. */
+  ep = text + size;
+  d = d1[U(tp[-1])];
+  while (d <= ep - tp)
+    {
+      d = d1[U((tp += d)[-1])];
+      if (d != 0)
+	continue;
+      if (U(tp[-2]) == gc)
+	{
+	  for (i = 3; i <= len && U(tp[-i]) == U(sp[-i]); ++i)
+	    ;
+	  if (i > len)
+	    return tp - len;
+	}
+      d = md2;
+    }
+
+  return 0;
+}
+
+/* Hairy multiple string search. */
+static char *
+cwexec(kws, text, len, kwsmatch)
+     kwset_t kws;
+     char *text;
+     size_t len;
+     struct kwsmatch *kwsmatch;
+{
+  struct kwset *kwset;
+  struct trie **next, *trie, *accept;
+  char *beg, *lim, *mch, *lmch;
+  register unsigned char c, *delta;
+  register int d;
+  register char *end, *qlim;
+  register struct tree *tree;
+  register char *trans;
+
+  /* Initialize register copies and look for easy ways out. */
+  kwset = (struct kwset *) kws;
+  if (len < kwset->mind)
+    return 0;
+  next = kwset->next;
+  delta = kwset->delta;
+  trans = kwset->trans;
+  lim = text + len;
+  end = text;
+  if ((d = kwset->mind) != 0)
+    mch = 0;
+  else
+    {
+      mch = text, accept = kwset->trie;
+      goto match;
+    }
+
+  if (len >= 4 * kwset->mind)
+    qlim = lim - 4 * kwset->mind;
+  else
+    qlim = 0;
+
+  while (lim - end >= d)
+    {
+      if (qlim && end <= qlim)
+	{
+	  end += d - 1;
+	  while ((d = delta[c = *end]) && end < qlim)
+	    {
+	      end += d;
+	      end += delta[(unsigned char) *end];
+	      end += delta[(unsigned char) *end];
+	    }
+	  ++end;
+	}
+      else
+	d = delta[c = (end += d)[-1]];
+      if (d)
+	continue;
+      beg = end - 1;
+      trie = next[c];
+      if (trie->accepting)
+	{
+	  mch = beg;
+	  accept = trie;
+	}
+      d = trie->shift;
+      while (beg > text)
+	{
+	  c = trans ? trans[(unsigned char) *--beg] : *--beg;
+	  tree = trie->links;
+	  while (tree && c != tree->label)
+	    if (c < tree->label)
+	      tree = tree->llink;
+	    else
+	      tree = tree->rlink;
+	  if (tree)
+	    {
+	      trie = tree->trie;
+	      if (trie->accepting)
+		{
+		  mch = beg;
+		  accept = trie;
+		}
+	    }
+	  else
+	    break;
+	  d = trie->shift;
+	}
+      if (mch)
+	goto match;
+    }
+  return 0;
+
+ match:
+  /* Given a known match, find the longest possible match anchored
+     at or before its starting point.  This is nearly a verbatim
+     copy of the preceding main search loops. */
+  if (lim - mch > kwset->maxd)
+    lim = mch + kwset->maxd;
+  lmch = 0;
+  d = 1;
+  while (lim - end >= d)
+    {
+      if ((d = delta[c = (end += d)[-1]]) != 0)
+	continue;
+      beg = end - 1;
+      if (!(trie = next[c]))
+	{
+	  d = 1;
+	  continue;
+	}
+      if (trie->accepting && beg <= mch)
+	{
+	  lmch = beg;
+	  accept = trie;
+	}
+      d = trie->shift;
+      while (beg > text)
+	{
+	  c = trans ? trans[(unsigned char) *--beg] : *--beg;
+	  tree = trie->links;
+	  while (tree && c != tree->label)
+	    if (c < tree->label)
+	      tree = tree->llink;
+	    else
+	      tree = tree->rlink;
+	  if (tree)
+	    {
+	      trie = tree->trie;
+	      if (trie->accepting && beg <= mch)
+		{
+		  lmch = beg;
+		  accept = trie;
+		}
+	    }
+	  else
+	    break;
+	  d = trie->shift;
+	}
+      if (lmch)
+	{
+	  mch = lmch;
+	  goto match;
+	}
+      if (!d)
+	d = 1;
+    }
+
+  if (kwsmatch)
+    {
+      kwsmatch->index = accept->accepting / 2;
+      kwsmatch->beg[0] = mch;
+      kwsmatch->size[0] = accept->depth;
+    }
+  return mch;
+}
+
+/* Search through the given text for a match of any member of the
+   given keyword set.  Return a pointer to the first character of
+   the matching substring, or NULL if no match is found.  If KWSMATCH
+   is non-NULL, store in the referenced structure the index number of
+   the particular keyword matched, together with the start and length
+   of the matching substring. */
+char *
+kwsexec(kws, text, size, kwsmatch)
+     kwset_t kws;
+     char *text;
+     size_t size;
+     struct kwsmatch *kwsmatch;
+{
+  struct kwset *kwset;
+  char *ret;
+
+  kwset = (struct kwset *) kws;
+  if (kwset->words == 1 && kwset->trans == 0)
+    {
+      ret = bmexec(kws, text, size);
+      if (kwsmatch != 0 && ret != 0)
+	{
+	  kwsmatch->index = 0;
+	  kwsmatch->beg[0] = ret;
+	  kwsmatch->size[0] = kwset->mind;
+	}
+      return ret;
+    }
+  else
+    return cwexec(kws, text, size, kwsmatch);
+}
+
+/* Free the components of the given keyword set. */
+void
+kwsfree(kws)
+     kwset_t kws;
+{
+  struct kwset *kwset;
+
+  kwset = (struct kwset *) kws;
+  obstack_free(&kwset->obstack, 0);
+  free(kws);
+}
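The single-keyword branch of kwsprep and the tail loop of bmexec above can be summarized in a much smaller form. The following is only a sketch under stated assumptions, not the code from kwset.c: `bm_build_delta` and `bm_find` are hypothetical names, the table uses `size_t` entries rather than `unsigned char`, and the shift after a failed comparison is a conservative 1 instead of the precomputed `mind2`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NCHAR 256   /* assumed: one table entry per unsigned char value */

/* Fill delta[] in the same way as the single-keyword branch of kwsprep:
   every character starts with a shift of LEN, and each pattern character
   gets the distance from its last occurrence to the end of the pattern
   (0 for the final character). */
static void
bm_build_delta(const char *pat, size_t len, size_t delta[NCHAR])
{
  size_t i;

  for (i = 0; i < NCHAR; ++i)
    delta[i] = len;
  for (i = 0; i < len; ++i)
    delta[(unsigned char) pat[i]] = len - (i + 1);
}

/* Hypothetical helper: return the offset of the first occurrence of PAT
   in TEXT, or -1 if there is none. */
static long
bm_find(const char *text, size_t size, const char *pat, size_t len)
{
  size_t delta[NCHAR];
  size_t end;                   /* index one past the current window */

  assert(text != NULL && pat != NULL);
  if (len == 0)
    return 0;
  if (len > size)
    return -1;

  bm_build_delta(pat, len, delta);
  end = len;
  while (end <= size)
    {
      size_t d = delta[(unsigned char) text[end - 1]];
      if (d != 0)
        {
          end += d;             /* last window character cannot end a match */
          continue;
        }
      if (memcmp(text + end - len, pat, len) == 0)
        return (long) (end - len);
      end += 1;                 /* conservative; kwset.c shifts by mind2 */
    }
  return -1;
}
```

Note that the table is always indexed through an `unsigned char` cast. Bytes with the high bit set would otherwise index with a negative value on platforms where plain `char` is signed.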
--- a/gnu/usr.bin/grep/kwset.h
+++ b/gnu/usr.bin/grep/kwset.h
@ -0,0 +1,69 @@
+/* kwset.h - header declaring the keyword set library.
+   Copyright 1989 Free Software Foundation
+		  Written August 1989 by Mike Haertel.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 1, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+   The author may be reached (Email) at the address mike@ai.mit.edu,
+   or (US mail) as Mike Haertel c/o Free Software Foundation. */
+
+struct kwsmatch
+{
+  int index;			/* Index number of matching keyword. */
+  char *beg[1];			/* Begin pointer for each submatch. */
+  size_t size[1];		/* Length of each submatch. */
+};
+
+#if __STDC__
+
+typedef void *kwset_t;
+
+/* Return an opaque pointer to a newly allocated keyword set, or NULL
+   if enough memory cannot be obtained.  The argument if non-NULL
+   specifies a table of character translations to be applied to all
+   pattern and search text. */
+extern kwset_t kwsalloc(char *);
+
+/* Incrementally extend the keyword set to include the given string.
+   Return NULL for success, or an error message.  Remember an index
+   number for each keyword included in the set. */
+extern char *kwsincr(kwset_t, char *, size_t);
+
+/* When the keyword set has been completely built, prepare it for
+   use.  Return NULL for success, or an error message. */
+extern char *kwsprep(kwset_t);
+
+/* Search through the given buffer for a member of the keyword set.
+   Return a pointer to the leftmost longest match found, or NULL if
+   no match is found.  If the struct kwsmatch pointer is non-NULL,
+   store in it the index of the particular keyword found, along with
+   the start and length of the matching substring. */
+extern char *kwsexec(kwset_t, char *, size_t, struct kwsmatch *);
+
+/* Deallocate the given keyword set and all its associated storage. */
+extern void kwsfree(kwset_t);
+
+#else
+
+typedef char *kwset_t;
+
+extern kwset_t kwsalloc();
+extern char *kwsincr();
+extern char *kwsprep();
+extern char *kwsexec();
+extern void kwsfree();
+
+#endif
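The index number reported in `struct kwsmatch` is recovered from the `accepting` field of a trie node. kwsincr stores `1 + 2 * kwset->words` there, and cwexec reads it back as `accept->accepting / 2`. A minimal sketch of that encoding follows; `accepting_encode` and `accepting_decode` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Encode the 0-based INDEX of a keyword as a nonzero accepting value,
   mirroring `1 + 2 * kwset->words` in kwsincr. */
static int
accepting_encode(int index)
{
  assert(index >= 0);
  return 1 + 2 * index;
}

/* Recover the keyword index from a nonzero accepting value, mirroring
   `accept->accepting / 2` in cwexec. */
static int
accepting_decode(int accepting)
{
  assert(accepting != 0);
  return accepting / 2;
}
```

Because the encoded value is always odd, keyword 0 still yields a nonzero `accepting` field, so a plain truth test distinguishes accepting nodes from the rest.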
--- a/gnu/usr.bin/grep/obstack.c
+++ b/gnu/usr.bin/grep/obstack.c
@ -0,0 +1,454 @@
+/* obstack.c - subroutines used implicitly by object stack macros
+   Copyright (C) 1988, 1993 Free Software Foundation, Inc.
+
+This program is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 2, or (at your option) any
+later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.  */
+
+#include "obstack.h"
+
+/* This is just to get __GNU_LIBRARY__ defined.  */
+#include <stdio.h>
+
+/* Comment out all this code if we are using the GNU C Library, and are not
+   actually compiling the library itself.  This code is part of the GNU C
+   Library, but also included in many other GNU distributions.  Compiling
+   and linking in this code is a waste when using the GNU C library
+   (especially if it is a shared library).  Rather than having every GNU
+   program understand `configure --with-gnu-libc' and omit the object files,
+   it is simpler to just do this in the source for each such file.  */
+
+#if defined (_LIBC) || !defined (__GNU_LIBRARY__)
+
+
+#ifdef __STDC__
+#define POINTER void *
+#else
+#define POINTER char *
+#endif
+
+/* Determine default alignment.  */
+struct fooalign {char x; double d;};
+#define DEFAULT_ALIGNMENT  \
+  ((PTR_INT_TYPE) ((char *)&((struct fooalign *) 0)->d - (char *)0))
+/* If malloc were really smart, it would round addresses to DEFAULT_ALIGNMENT.
+   But in fact it might be less smart and round addresses to as much as
+   DEFAULT_ROUNDING.  So we prepare for it to do that.  */
+union fooround {long x; double d;};
+#define DEFAULT_ROUNDING (sizeof (union fooround))
+
+/* When we copy a long block of data, this is the unit to do it with.
+   On some machines, copying successive ints does not work;
+   in such a case, redefine COPYING_UNIT to `long' (if that works)
+   or `char' as a last resort.  */
+#ifndef COPYING_UNIT
+#define COPYING_UNIT int
+#endif
+
+/* The non-GNU-C macros copy the obstack into this global variable
+   to avoid multiple evaluation.  */
+
+struct obstack *_obstack;
+
+/* Define a macro that either calls functions with the traditional malloc/free
+   calling interface, or calls functions with the mmalloc/mfree interface
+   (that adds an extra first argument), based on the state of use_extra_arg.
+   For free, do not use ?:, since some compilers, like the MIPS compilers,
+   do not allow (expr) ? void : void.  */
+
+#define CALL_CHUNKFUN(h, size) \
+  (((h) -> use_extra_arg) \
+   ? (*(h)->chunkfun) ((h)->extra_arg, (size)) \
+   : (*(h)->chunkfun) ((size)))
+
+#define CALL_FREEFUN(h, old_chunk) \
+  do { \
+    if ((h) -> use_extra_arg) \
+      (*(h)->freefun) ((h)->extra_arg, (old_chunk)); \
+    else \
+      (*(h)->freefun) ((old_chunk)); \
+  } while (0)
+
+
+/* Initialize an obstack H for use.  Specify chunk size SIZE (0 means default).
+   Objects start on multiples of ALIGNMENT (0 means use default).
+   CHUNKFUN is the function to use to allocate chunks,
+   and FREEFUN the function to free them.  */
+
+void
+_obstack_begin (h, size, alignment, chunkfun, freefun)
+     struct obstack *h;
+     int size;
+     int alignment;
+     POINTER (*chunkfun) ();
+     void (*freefun) ();
+{
+  register struct _obstack_chunk* chunk; /* points to new chunk */
+
+  if (alignment == 0)
+    alignment = DEFAULT_ALIGNMENT;
+  if (size == 0)
+    /* Default size is what GNU malloc can fit in a 4096-byte block.  */
+    {
+      /* 12 is sizeof (mhead) and 4 is EXTRA from GNU malloc.
+	 Use the values for range checking, because if range checking is off,
+	 the extra bytes won't be missed terribly, but if range checking is on
+	 and we used a larger request, a whole extra 4096 bytes would be
+	 allocated.
+
+	 These numbers are irrelevant to the new GNU malloc.  I suspect it is
+	 less sensitive to the size of the request.  */
+      int extra = ((((12 + DEFAULT_ROUNDING - 1) & ~(DEFAULT_ROUNDING - 1))
+		    + 4 + DEFAULT_ROUNDING - 1)
+		   & ~(DEFAULT_ROUNDING - 1));
+      size = 4096 - extra;
+    }
+
+  h->chunkfun = (struct _obstack_chunk * (*)()) chunkfun;
+  h->freefun = freefun;
+  h->chunk_size = size;
+  h->alignment_mask = alignment - 1;
+  h->use_extra_arg = 0;
+
+  chunk = h->chunk = CALL_CHUNKFUN (h, h -> chunk_size);
+  h->next_free = h->object_base = chunk->contents;
+  h->chunk_limit = chunk->limit
+    = (char *) chunk + h->chunk_size;
+  chunk->prev = 0;
+  /* The initial chunk now contains no empty object.  */
+  h->maybe_empty_object = 0;
+}
+
+void
+_obstack_begin_1 (h, size, alignment, chunkfun, freefun, arg)
+     struct obstack *h;
+     int size;
+     int alignment;
+     POINTER (*chunkfun) ();
+     void (*freefun) ();
+     POINTER arg;
+{
+  register struct _obstack_chunk* chunk; /* points to new chunk */
+
+  if (alignment == 0)
+    alignment = DEFAULT_ALIGNMENT;
+  if (size == 0)
+    /* Default size is what GNU malloc can fit in a 4096-byte block.  */
+    {
+      /* 12 is sizeof (mhead) and 4 is EXTRA from GNU malloc.
+	 Use the values for range checking, because if range checking is off,
+	 the extra bytes won't be missed terribly, but if range checking is on
+	 and we used a larger request, a whole extra 4096 bytes would be
+	 allocated.
+
+	 These numbers are irrelevant to the new GNU malloc.  I suspect it is
+	 less sensitive to the size of the request.  */
+      int extra = ((((12 + DEFAULT_ROUNDING - 1) & ~(DEFAULT_ROUNDING - 1))
+		    + 4 + DEFAULT_ROUNDING - 1)
+		   & ~(DEFAULT_ROUNDING - 1));
+      size = 4096 - extra;
+    }
+
+  h->chunkfun = (struct _obstack_chunk * (*)()) chunkfun;
+  h->freefun = freefun;
+  h->chunk_size = size;
+  h->alignment_mask = alignment - 1;
+  h->extra_arg = arg;
+  h->use_extra_arg = 1;
+
+  chunk = h->chunk = CALL_CHUNKFUN (h, h -> chunk_size);
+  h->next_free = h->object_base = chunk->contents;
+  h->chunk_limit = chunk->limit
+    = (char *) chunk + h->chunk_size;
+  chunk->prev = 0;
+  /* The initial chunk now contains no empty object.  */
+  h->maybe_empty_object = 0;
+}
+
+/* Allocate a new current chunk for the obstack *H
+   on the assumption that LENGTH bytes need to be added
+   to the current object, or a new object of length LENGTH allocated.
+   Copies any partial object from the end of the old chunk
+   to the beginning of the new one.  */
+
+void
+_obstack_newchunk (h, length)
+     struct obstack *h;
+     int length;
+{
+  register struct _obstack_chunk*	old_chunk = h->chunk;
+  register struct _obstack_chunk*	new_chunk;
+  register long	new_size;
+  register int obj_size = h->next_free - h->object_base;
+  register int i;
+  int already;
+
+  /* Compute size for new chunk.  */
+  new_size = (obj_size + length) + (obj_size >> 3) + 100;
+  if (new_size < h->chunk_size)
+    new_size = h->chunk_size;
+
+  /* Allocate and initialize the new chunk.  */
+  new_chunk = h->chunk = CALL_CHUNKFUN (h, new_size);
+  new_chunk->prev = old_chunk;
+  new_chunk->limit = h->chunk_limit = (char *) new_chunk + new_size;
+
+  /* Move the existing object to the new chunk.
+     Word at a time is fast and is safe if the object
+     is sufficiently aligned.  */
+  if (h->alignment_mask + 1 >= DEFAULT_ALIGNMENT)
+    {
+      for (i = obj_size / sizeof (COPYING_UNIT) - 1;
+	   i >= 0; i--)
+	((COPYING_UNIT *)new_chunk->contents)[i]
+	  = ((COPYING_UNIT *)h->object_base)[i];
+      /* We used to copy the odd few remaining bytes as one extra COPYING_UNIT,
+	 but that can cross a page boundary on a machine
+	 which does not do strict alignment for COPYING_UNITS.  */
+      already = obj_size / sizeof (COPYING_UNIT) * sizeof (COPYING_UNIT);
+    }
+  else
+    already = 0;
+  /* Copy remaining bytes one by one.  */
+  for (i = already; i < obj_size; i++)
+    new_chunk->contents[i] = h->object_base[i];
+
+  /* If the object just copied was the only data in OLD_CHUNK,
+     free that chunk and remove it from the chain.
+     But not if that chunk might contain an empty object.  */
+  if (h->object_base == old_chunk->contents && ! h->maybe_empty_object)
+    {
+      new_chunk->prev = old_chunk->prev;
+      CALL_FREEFUN (h, old_chunk);
+    }
+
+  h->object_base = new_chunk->contents;
+  h->next_free = h->object_base + obj_size;
+  /* The new chunk certainly contains no empty object yet.  */
+  h->maybe_empty_object = 0;
+}
+
+/* Return nonzero if object OBJ has been allocated from obstack H.
+   This is here for debugging.
+   If you use it in a program, you are probably losing.  */
+
+int
+_obstack_allocated_p (h, obj)
+     struct obstack *h;
+     POINTER obj;
+{
+  register struct _obstack_chunk*  lp;	/* below addr of any objects in this chunk */
+  register struct _obstack_chunk*  plp;	/* point to previous chunk if any */
+
+  lp = (h)->chunk;
+  /* We use >= rather than > since the object cannot be exactly at
+     the beginning of the chunk but might be an empty object exactly
+     at the end of an adjacent chunk. */
+  while (lp != 0 && ((POINTER)lp >= obj || (POINTER)(lp)->limit < obj))
+    {
+      plp = lp->prev;
+      lp = plp;
+    }
+  return lp != 0;
+}
+
+/* Free objects in obstack H, including OBJ and everything allocated
+   more recently than OBJ.  If OBJ is zero, free everything in H.  */
+
+#undef obstack_free
+
+/* This function has two names with identical definitions.
+   This is the first one, called from non-ANSI code.  */
+
+void
+_obstack_free (h, obj)
+     struct obstack *h;
+     POINTER obj;
+{
+  register struct _obstack_chunk*  lp;	/* below addr of any objects in this chunk */
+  register struct _obstack_chunk*  plp;	/* point to previous chunk if any */
+
+  lp = h->chunk;
+  /* We use >= because there cannot be an object at the beginning of a chunk.
+     But there can be an empty object at that address
+     at the end of another chunk.  */
+  while (lp != 0 && ((POINTER)lp >= obj || (POINTER)(lp)->limit < obj))
+    {
+      plp = lp->prev;
+      CALL_FREEFUN (h, lp);
+      lp = plp;
+      /* If we switch chunks, we can't tell whether the new current
+	 chunk contains an empty object, so assume that it may.  */
+      h->maybe_empty_object = 1;
+    }
+  if (lp)
+    {
+      h->object_base = h->next_free = (char *)(obj);
+      h->chunk_limit = lp->limit;
+      h->chunk = lp;
+    }
+  else if (obj != 0)
+    /* obj is not in any of the chunks! */
+    abort ();
+}
+
+/* This function is used from ANSI code.  */
+
+void
+obstack_free (h, obj)
+     struct obstack *h;
+     POINTER obj;
+{
+  register struct _obstack_chunk*  lp;	/* below addr of any objects in this chunk */
+  register struct _obstack_chunk*  plp;	/* point to previous chunk if any */
+
+  lp = h->chunk;
+  /* We use >= because there cannot be an object at the beginning of a chunk.
+     But there can be an empty object at that address
+     at the end of another chunk.  */
+  while (lp != 0 && ((POINTER)lp >= obj || (POINTER)(lp)->limit < obj))
+    {
+      plp = lp->prev;
+      CALL_FREEFUN (h, lp);
+      lp = plp;
+      /* If we switch chunks, we can't tell whether the new current
+	 chunk contains an empty object, so assume that it may.  */
+      h->maybe_empty_object = 1;
+    }
+  if (lp)
+    {
+      h->object_base = h->next_free = (char *)(obj);
+      h->chunk_limit = lp->limit;
+      h->chunk = lp;
+    }
+  else if (obj != 0)
+    /* obj is not in any of the chunks! */
+    abort ();
+}
+
+#if 0
+/* These are now turned off because the applications do not use it
+   and it uses bcopy via obstack_grow, which causes trouble on sysV.  */
+
+/* Now define the functional versions of the obstack macros.
+   Define them to simply use the corresponding macros to do the job.  */
+
+#ifdef __STDC__
+/* These function definitions do not work with non-ANSI preprocessors;
+   they won't pass through the macro names in parentheses.  */
+
+/* The function names appear in parentheses in order to prevent
+   the macro-definitions of the names from being expanded there.  */
+
+POINTER (obstack_base) (obstack)
+     struct obstack *obstack;
+{
+  return obstack_base (obstack);
+}
+
+POINTER (obstack_next_free) (obstack)
+     struct obstack *obstack;
+{
+  return obstack_next_free (obstack);
+}
+
+int (obstack_object_size) (obstack)
+     struct obstack *obstack;
+{
+  return obstack_object_size (obstack);
+}
+
+int (obstack_room) (obstack)
+     struct obstack *obstack;
+{
+  return obstack_room (obstack);
+}
+
+void (obstack_grow) (obstack, pointer, length)
+     struct obstack *obstack;
+     POINTER pointer;
+     int length;
+{
+  obstack_grow (obstack, pointer, length);
+}
+
+void (obstack_grow0) (obstack, pointer, length)
+     struct obstack *obstack;
+     POINTER pointer;
+     int length;
+{
+  obstack_grow0 (obstack, pointer, length);
+}
+
+void (obstack_1grow) (obstack, character)
+     struct obstack *obstack;
+     int character;
+{
+  obstack_1grow (obstack, character);
+}
+
+void (obstack_blank) (obstack, length)
+     struct obstack *obstack;
+     int length;
+{
+  obstack_blank (obstack, length);
+}
+
+void (obstack_1grow_fast) (obstack, character)
+     struct obstack *obstack;
+     int character;
+{
+  obstack_1grow_fast (obstack, character);
+}
+
+void (obstack_blank_fast) (obstack, length)
+     struct obstack *obstack;
+     int length;
+{
+  obstack_blank_fast (obstack, length);
+}
+
+POINTER (obstack_finish) (obstack)
+     struct obstack *obstack;
+{
+  return obstack_finish (obstack);
+}
+
+POINTER (obstack_alloc) (obstack, length)
+     struct obstack *obstack;
+     int length;
+{
+  return obstack_alloc (obstack, length);
+}
+
+POINTER (obstack_copy) (obstack, pointer, length)
+     struct obstack *obstack;
+     POINTER pointer;
+     int length;
+{
+  return obstack_copy (obstack, pointer, length);
+}
+
+POINTER (obstack_copy0) (obstack, pointer, length)
+     struct obstack *obstack;
+     POINTER pointer;
+     int length;
+{
+  return obstack_copy0 (obstack, pointer, length);
+}
+
+#endif /* __STDC__ */
+
+#endif /* 0 */
+
+#endif	/* _LIBC or not __GNU_LIBRARY__.  */
--- a/gnu/usr.bin/grep/obstack.h
+++ b/gnu/usr.bin/grep/obstack.h
@ -0,0 +1,484 @@
+/* obstack.h - object stack macros
+   Copyright (C) 1988, 1992 Free Software Foundation, Inc.
+
+This program is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 2, or (at your option) any
+later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.  */
+
+/* Summary:
+
+All the apparent functions defined here are macros. The idea
+is that you would use these pre-tested macros to solve a
+very specific set of problems, and they would run fast.
+Caution: no side-effects in arguments please!! They may be
+evaluated MANY times!!
+
+These macros operate a stack of objects.  Each object starts life
+small, and may grow to maturity.  (Consider building a word syllable
+by syllable.)  An object can move while it is growing.  Once it has
+been "finished" it never changes address again.  So the "top of the
+stack" is typically an immature growing object, while the rest of the
+stack is of mature, fixed size and fixed address objects.
+
+These routines grab large chunks of memory, using a function you
+supply, called `obstack_chunk_alloc'.  On occasion, they free chunks,
+by calling `obstack_chunk_free'.  You must define them and declare
+them before using any obstack macros.
+
+Each independent stack is represented by a `struct obstack'.
+Each of the obstack macros expects a pointer to such a structure
+as the first argument.
+
+One motivation for this package is the problem of growing char strings
+in symbol tables.  Unless you are a "fascist pig with a read-only mind"
+--Gosper's immortal quote from HAKMEM item 154, out of context--you
+would not like to put any arbitrary upper limit on the length of your
+symbols.
+
+In practice this often means you will build many short symbols and a
+few long symbols.  At the time you are reading a symbol you don't know
+how long it is.  One traditional method is to read a symbol into a
+buffer, realloc()ating the buffer every time you try to read a symbol
+that is longer than the buffer.  This works well, but you will still
+want to copy the symbol from the buffer to a more permanent
+symbol-table entry about half the time.
+
+With obstacks, you can work differently.  Use one obstack for all symbol
+names.  As you read a symbol, grow the name in the obstack gradually.
+When the name is complete, finalize it.  Then, if the symbol exists already,
+free the newly read name.
+
+The way we do this is to take a large chunk, allocating memory from
+low addresses.  When you want to build a symbol in the chunk you just
+add chars above the current "high water mark" in the chunk.  When you
+have finished adding chars, because you got to the end of the symbol,
+you know how long the chars are, and you can create a new object.
+Mostly the chars will not burst over the highest address of the chunk,
+because you would typically expect a chunk to be (say) 100 times as
+long as an average object.
+
+In case that isn't clear, when we have enough chars to make up
+the object, THEY ARE ALREADY CONTIGUOUS IN THE CHUNK (guaranteed)
+so we just point to it where it lies.  No moving of chars is
+needed and this is the second win: potentially long strings need
+never be explicitly shuffled. Once an object is formed, it does not
+change its address during its lifetime.
+
+When the chars burst over a chunk boundary, we allocate a larger
+chunk, and then copy the partly formed object from the end of the old
+chunk to the beginning of the new larger chunk.  We then carry on
+accreting characters to the end of the object as we normally would.
+
+A special macro is provided to add a single char at a time to a
+growing object.  This allows the use of register variables, which
+break the ordinary 'growth' macro.
+
+Summary:
+	We allocate large chunks.
+	We carve out one object at a time from the current chunk.
+	Once carved, an object never moves.
+	We are free to append data of any size to the currently
+	  growing object.
+	Exactly one object is growing in an obstack at any one time.
+	You can run one obstack per control block.
+	You may have as many control blocks as you dare.
+	Because of the way we do it, you can `unwind' an obstack
+	  back to a previous state. (You may remove objects much
+	  as you would with a stack.)
+*/
+
+
+/* Don't do the contents of this file more than once.  */
+
+#ifndef __OBSTACKS__
+#define __OBSTACKS__
+
+/* We use subtraction of (char *)0 instead of casting to int
+   because on word-addressable machines a simple cast to int
+   may ignore the byte-within-word field of the pointer.  */
+
+#ifndef __PTR_TO_INT
+#define __PTR_TO_INT(P) ((P) - (char *)0)
+#endif
+
+#ifndef __INT_TO_PTR
+#define __INT_TO_PTR(P) ((P) + (char *)0)
+#endif
+
+/* We need the type of the resulting object.  In ANSI C it is ptrdiff_t
+   but in traditional C it is usually long.  If we are in ANSI C and
+   don't already have ptrdiff_t get it.  */
+
+#if defined (__STDC__) && ! defined (offsetof)
+#if defined (__GNUC__) && defined (IN_GCC)
+/* On NeXT machines, the system's stddef.h screws up if included
+   after we have defined just ptrdiff_t, so include all of gstddef.h.
+   Otherwise, define just ptrdiff_t, which is all we need.  */
+#ifndef __NeXT__
+#define __need_ptrdiff_t
+#endif
+
+/* While building GCC, the stddef.h that goes with GCC has this name.  */
+#include "gstddef.h"
+#else
+#include <stddef.h>
+#endif
+#endif
+
+#ifdef __STDC__
+#define PTR_INT_TYPE ptrdiff_t
+#else
+#define PTR_INT_TYPE long
+#endif
+
+struct _obstack_chunk		/* Lives at front of each chunk. */
+{
+  char  *limit;			/* 1 past end of this chunk */
+  struct _obstack_chunk *prev;	/* address of prior chunk or NULL */
+  char	contents[4];		/* objects begin here */
+};
+
+struct obstack		/* control current object in current chunk */
+{
+  long	chunk_size;		/* preferred size to allocate chunks in */
+  struct _obstack_chunk* chunk;	/* address of current struct _obstack_chunk */
+  char	*object_base;		/* address of object we are building */
+  char	*next_free;		/* where to add next char to current object */
+  char	*chunk_limit;		/* address of char after current chunk */
+  PTR_INT_TYPE temp;		/* Temporary for some macros.  */
+  int   alignment_mask;		/* Mask of alignment for each object. */
+  struct _obstack_chunk *(*chunkfun) (); /* User's fcn to allocate a chunk.  */
+  void (*freefun) ();		/* User's function to free a chunk.  */
+  char *extra_arg;		/* first arg for chunk alloc/dealloc funcs */
+  unsigned use_extra_arg:1;	/* chunk alloc/dealloc funcs take extra arg */
+  unsigned maybe_empty_object:1;/* There is a possibility that the current
+				   chunk contains a zero-length object.  This
+				   prevents freeing the chunk if we allocate
+				   a bigger chunk to replace it. */
+};
+
+/* Declare the external functions we use; they are in obstack.c.  */
+
+#ifdef __STDC__
+extern void _obstack_newchunk (struct obstack *, int);
+extern void _obstack_free (struct obstack *, void *);
+extern void _obstack_begin (struct obstack *, int, int,
+			    void *(*) (), void (*) ());
+extern void _obstack_begin_1 (struct obstack *, int, int,
+			      void *(*) (), void (*) (), void *);
+#else
+extern void _obstack_newchunk ();
+extern void _obstack_free ();
+extern void _obstack_begin ();
+extern void _obstack_begin_1 ();
+#endif
+
+#ifdef __STDC__
+
+/* Do the function-declarations after the structs
+   but before defining the macros.  */
+
+void obstack_init (struct obstack *obstack);
+
+void * obstack_alloc (struct obstack *obstack, int size);
+
+void * obstack_copy (struct obstack *obstack, void *address, int size);
+void * obstack_copy0 (struct obstack *obstack, void *address, int size);
+
+void obstack_free (struct obstack *obstack, void *block);
+
+void obstack_blank (struct obstack *obstack, int size);
+
+void obstack_grow (struct obstack *obstack, void *data, int size);
+void obstack_grow0 (struct obstack *obstack, void *data, int size);
+
+void obstack_1grow (struct obstack *obstack, int data_char);
+void obstack_ptr_grow (struct obstack *obstack, void *data);
+void obstack_int_grow (struct obstack *obstack, int data);
+
+void * obstack_finish (struct obstack *obstack);
+
+int obstack_object_size (struct obstack *obstack);
+
+int obstack_room (struct obstack *obstack);
+void obstack_1grow_fast (struct obstack *obstack, int data_char);
+void obstack_ptr_grow_fast (struct obstack *obstack, void *data);
+void obstack_int_grow_fast (struct obstack *obstack, int data);
+void obstack_blank_fast (struct obstack *obstack, int size);
+
+void * obstack_base (struct obstack *obstack);
+void * obstack_next_free (struct obstack *obstack);
+int obstack_alignment_mask (struct obstack *obstack);
+int obstack_chunk_size (struct obstack *obstack);
+
+#endif /* __STDC__ */
+
+/* Non-ANSI C cannot really support alternative functions for these macros,
+   so we do not declare them.  */
+
+/* Pointer to beginning of object being allocated or to be allocated next.
+   Note that this might not be the final address of the object
+   because a new chunk might be needed to hold the final size.  */
+
+#define obstack_base(h) ((h)->object_base)
+
+/* Size for allocating ordinary chunks.  */
+
+#define obstack_chunk_size(h) ((h)->chunk_size)
+
+/* Pointer to next byte not yet allocated in current chunk.  */
+
+#define obstack_next_free(h)	((h)->next_free)
+
+/* Mask specifying low bits that should be clear in address of an object.  */
+
+#define obstack_alignment_mask(h) ((h)->alignment_mask)
+
+#define obstack_init(h) \
+  _obstack_begin ((h), 0, 0, \
+		  (void *(*) ()) obstack_chunk_alloc, (void (*) ()) obstack_chunk_free)
+
+#define obstack_begin(h, size) \
+  _obstack_begin ((h), (size), 0, \
+		  (void *(*) ()) obstack_chunk_alloc, (void (*) ()) obstack_chunk_free)
+
+#define obstack_specify_allocation(h, size, alignment, chunkfun, freefun) \
+  _obstack_begin ((h), (size), (alignment), \
+		    (void *(*) ()) (chunkfun), (void (*) ()) (freefun))
+
+#define obstack_specify_allocation_with_arg(h, size, alignment, chunkfun, freefun, arg) \
+  _obstack_begin_1 ((h), (size), (alignment), \
+		    (void *(*) ()) (chunkfun), (void (*) ()) (freefun), (arg))
+
+#define obstack_1grow_fast(h,achar) (*((h)->next_free)++ = achar)
+
+#define obstack_blank_fast(h,n) ((h)->next_free += (n))
+
+#if defined (__GNUC__) && defined (__STDC__)
+#if __GNUC__ < 2 || defined(NeXT)
+#define __extension__
+#endif
+
+/* For GNU C, if not -traditional,
+   we can define these macros to compute all args only once
+   without using a global variable.
+   Also, we can avoid using the `temp' slot, to make faster code.  */
+
+#define obstack_object_size(OBSTACK)					\
+  __extension__								\
+  ({ struct obstack *__o = (OBSTACK);					\
+     (unsigned) (__o->next_free - __o->object_base); })
+
+#define obstack_room(OBSTACK)						\
+  __extension__								\
+  ({ struct obstack *__o = (OBSTACK);					\
+     (unsigned) (__o->chunk_limit - __o->next_free); })
+
+/* Note that the call to _obstack_newchunk is enclosed in (..., 0)
+   so that we can avoid having void expressions
+   in the arms of the conditional expression.
+   Casting the third operand to void was tried before,
+   but some compilers won't accept it.  */
+#define obstack_grow(OBSTACK,where,length)				\
+__extension__								\
+({ struct obstack *__o = (OBSTACK);					\
+   int __len = (length);						\
+   ((__o->next_free + __len > __o->chunk_limit)				\
+    ? (_obstack_newchunk (__o, __len), 0) : 0);				\
+   bcopy (where, __o->next_free, __len);				\
+   __o->next_free += __len;						\
+   (void) 0; })
+
+#define obstack_grow0(OBSTACK,where,length)				\
+__extension__								\
+({ struct obstack *__o = (OBSTACK);					\
+   int __len = (length);						\
+   ((__o->next_free + __len + 1 > __o->chunk_limit)			\
+    ? (_obstack_newchunk (__o, __len + 1), 0) : 0),			\
+   bcopy (where, __o->next_free, __len),				\
+   __o->next_free += __len,						\
+   *(__o->next_free)++ = 0;						\
+   (void) 0; })
+
+#define obstack_1grow(OBSTACK,datum)					\
+__extension__								\
+({ struct obstack *__o = (OBSTACK);					\
+   ((__o->next_free + 1 > __o->chunk_limit)				\
+    ? (_obstack_newchunk (__o, 1), 0) : 0),				\
+   *(__o->next_free)++ = (datum);					\
+   (void) 0; })
+
+/* These assume that the obstack alignment is good enough for pointers or ints,
+   and that the data added so far to the current object
+   shares that much alignment.  */
+   
+#define obstack_ptr_grow(OBSTACK,datum)					\
+__extension__								\
+({ struct obstack *__o = (OBSTACK);					\
+   ((__o->next_free + sizeof (void *) > __o->chunk_limit)		\
+    ? (_obstack_newchunk (__o, sizeof (void *)), 0) : 0),		\
+   *((void **)__o->next_free)++ = ((void *)datum);			\
+   (void) 0; })
+
+#define obstack_int_grow(OBSTACK,datum)					\
+__extension__								\
+({ struct obstack *__o = (OBSTACK);					\
+   ((__o->next_free + sizeof (int) > __o->chunk_limit)			\
+    ? (_obstack_newchunk (__o, sizeof (int)), 0) : 0),			\
+   *((int *)__o->next_free)++ = ((int)datum);				\
+   (void) 0; })
+
+#define obstack_ptr_grow_fast(h,aptr) (*((void **)(h)->next_free)++ = (void *)aptr)
+#define obstack_int_grow_fast(h,aint) (*((int *)(h)->next_free)++ = (int)aint)
+
+#define obstack_blank(OBSTACK,length)					\
+__extension__								\
+({ struct obstack *__o = (OBSTACK);					\
+   int __len = (length);						\
+   ((__o->chunk_limit - __o->next_free < __len)				\
+    ? (_obstack_newchunk (__o, __len), 0) : 0);				\
+   __o->next_free += __len;						\
+   (void) 0; })
+
+#define obstack_alloc(OBSTACK,length)					\
+__extension__								\
+({ struct obstack *__h = (OBSTACK);					\
+   obstack_blank (__h, (length));					\
+   obstack_finish (__h); })
+
+#define obstack_copy(OBSTACK,where,length)				\
+__extension__								\
+({ struct obstack *__h = (OBSTACK);					\
+   obstack_grow (__h, (where), (length));				\
+   obstack_finish (__h); })
+
+#define obstack_copy0(OBSTACK,where,length)				\
+__extension__								\
+({ struct obstack *__h = (OBSTACK);					\
+   obstack_grow0 (__h, (where), (length));				\
+   obstack_finish (__h); })
+
+/* The local variable is named __o1 to avoid a name conflict
+   when obstack_blank is called.  */
+#define obstack_finish(OBSTACK)  					\
+__extension__								\
+({ struct obstack *__o1 = (OBSTACK);					\
+   void *value = (void *) __o1->object_base;				\
+   if (__o1->next_free == value)					\
+     __o1->maybe_empty_object = 1;					\
+   __o1->next_free							\
+     = __INT_TO_PTR ((__PTR_TO_INT (__o1->next_free)+__o1->alignment_mask)\
+		     & ~ (__o1->alignment_mask));			\
+   ((__o1->next_free - (char *)__o1->chunk				\
+     > __o1->chunk_limit - (char *)__o1->chunk)				\
+    ? (__o1->next_free = __o1->chunk_limit) : 0);			\
+   __o1->object_base = __o1->next_free;					\
+   value; })
+
+#define obstack_free(OBSTACK, OBJ)					\
+__extension__								\
+({ struct obstack *__o = (OBSTACK);					\
+   void *__obj = (OBJ);							\
+   if (__obj > (void *)__o->chunk && __obj < (void *)__o->chunk_limit)  \
+     __o->next_free = __o->object_base = __obj;				\
+   else (obstack_free) (__o, __obj); })
+
+#else /* not __GNUC__ or not __STDC__ */
+
+#define obstack_object_size(h) \
+ (unsigned) ((h)->next_free - (h)->object_base)
+
+#define obstack_room(h)		\
+ (unsigned) ((h)->chunk_limit - (h)->next_free)
+
+#define obstack_grow(h,where,length)					\
+( (h)->temp = (length),							\
+  (((h)->next_free + (h)->temp > (h)->chunk_limit)			\
+   ? (_obstack_newchunk ((h), (h)->temp), 0) : 0),			\
+  bcopy (where, (h)->next_free, (h)->temp),				\
+  (h)->next_free += (h)->temp)
+
+#define obstack_grow0(h,where,length)					\
+( (h)->temp = (length),							\
+  (((h)->next_free + (h)->temp + 1 > (h)->chunk_limit)			\
+   ? (_obstack_newchunk ((h), (h)->temp + 1), 0) : 0),			\
+  bcopy (where, (h)->next_free, (h)->temp),				\
+  (h)->next_free += (h)->temp,						\
+  *((h)->next_free)++ = 0)
+
+#define obstack_1grow(h,datum)						\
+( (((h)->next_free + 1 > (h)->chunk_limit)				\
+   ? (_obstack_newchunk ((h), 1), 0) : 0),				\
+  *((h)->next_free)++ = (datum))
+
+#define obstack_ptr_grow(h,datum)					\
+( (((h)->next_free + sizeof (char *) > (h)->chunk_limit)		\
+   ? (_obstack_newchunk ((h), sizeof (char *)), 0) : 0),		\
+  *((char **)(((h)->next_free+=sizeof(char *))-sizeof(char *))) = ((char *)datum))
+
+#define obstack_int_grow(h,datum)					\
+( (((h)->next_free + sizeof (int) > (h)->chunk_limit)			\
+   ? (_obstack_newchunk ((h), sizeof (int)), 0) : 0),			\
+  *((int *)(((h)->next_free+=sizeof(int))-sizeof(int))) = ((int)datum))
+
+#define obstack_ptr_grow_fast(h,aptr) (*((char **)(h)->next_free)++ = (char *)aptr)
+#define obstack_int_grow_fast(h,aint) (*((int *)(h)->next_free)++ = (int)aint)
+
+#define obstack_blank(h,length)						\
+( (h)->temp = (length),							\
+  (((h)->chunk_limit - (h)->next_free < (h)->temp)			\
+   ? (_obstack_newchunk ((h), (h)->temp), 0) : 0),			\
+  (h)->next_free += (h)->temp)
+
+#define obstack_alloc(h,length)						\
+ (obstack_blank ((h), (length)), obstack_finish ((h)))
+
+#define obstack_copy(h,where,length)					\
+ (obstack_grow ((h), (where), (length)), obstack_finish ((h)))
+
+#define obstack_copy0(h,where,length)					\
+ (obstack_grow0 ((h), (where), (length)), obstack_finish ((h)))
+
+#define obstack_finish(h)  						\
+( ((h)->next_free == (h)->object_base					\
+   ? (((h)->maybe_empty_object = 1), 0)					\
+   : 0),								\
+  (h)->temp = __PTR_TO_INT ((h)->object_base),				\
+  (h)->next_free							\
+    = __INT_TO_PTR ((__PTR_TO_INT ((h)->next_free)+(h)->alignment_mask)	\
+		    & ~ ((h)->alignment_mask)),				\
+  (((h)->next_free - (char *)(h)->chunk					\
+    > (h)->chunk_limit - (char *)(h)->chunk)				\
+   ? ((h)->next_free = (h)->chunk_limit) : 0),				\
+  (h)->object_base = (h)->next_free,					\
+  __INT_TO_PTR ((h)->temp))
+
+#ifdef __STDC__
+#define obstack_free(h,obj)						\
+( (h)->temp = (char *)(obj) - (char *) (h)->chunk,			\
+  (((h)->temp > 0 && (h)->temp < (h)->chunk_limit - (char *) (h)->chunk)\
+   ? (int) ((h)->next_free = (h)->object_base				\
+	    = (h)->temp + (char *) (h)->chunk)				\
+   : (((obstack_free) ((h), (h)->temp + (char *) (h)->chunk), 0), 0)))
+#else
+#define obstack_free(h,obj)						\
+( (h)->temp = (char *)(obj) - (char *) (h)->chunk,			\
+  (((h)->temp > 0 && (h)->temp < (h)->chunk_limit - (char *) (h)->chunk)\
+   ? (int) ((h)->next_free = (h)->object_base				\
+	    = (h)->temp + (char *) (h)->chunk)				\
+   : (_obstack_free ((h), (h)->temp + (char *) (h)->chunk), 0)))
+#endif
+
+#endif /* not __GNUC__ or not __STDC__ */
+
+#endif /* not __OBSTACKS__ */
--- a/gnu/usr.bin/grep/regex.c
+++ b/gnu/usr.bin/grep/regex.c
--- a/gnu/usr.bin/grep/regex.h
+++ b/gnu/usr.bin/grep/regex.h
@ -1,5 +1,7 @@
-/* Definitions for data structures callers pass the regex library.
-   Copyright (C) 1985, 1989 Free Software Foundation, Inc.
+/* Definitions for data structures and routines for the regular
+   expression library, version 0.12.
+
+   Copyright (C) 1985, 1989, 1990, 1991, 1992, 1993 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -13,173 +15,476 @@

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
-   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  */

+#ifndef __REGEXP_LIBRARY_H__
+#define __REGEXP_LIBRARY_H__

-   In other words, you are welcome to use, share and improve this program.
-   You are forbidden to forbid anyone else to use, share and improve
-   what you give them.   Help stamp out software-hoarding!  */
+/* POSIX says that <sys/types.h> must be included (by the caller) before
+   <regex.h>.  */

-
-/* Define number of parens for which we record the beginnings and ends.
-   This affects how much space the `struct re_registers' type takes up.  */
-#ifndef RE_NREGS
-#define RE_NREGS 10
+#ifdef VMS
+/* VMS doesn't have `size_t' in <sys/types.h>, even though POSIX says it
+   should be there.  */
+#include <stddef.h>
 #endif

-/* These bits are used in the obscure_syntax variable to choose among
-   alternative regexp syntaxes.  */

-/* 1 means plain parentheses serve as grouping, and backslash
-     parentheses are needed for literal searching.
-   0 means backslash-parentheses are grouping, and plain parentheses
-     are for literal searching.  */
-#define RE_NO_BK_PARENS 1
+/* The following bits are used to determine the regexp syntax we
+   recognize.  The set/not-set meanings are chosen so that Emacs syntax
+   remains the value 0.  The bits are given in alphabetical order, and
+   the definitions shifted by one from the previous bit; thus, when we
+   add or remove a bit, only one other definition need change.  */
+typedef unsigned reg_syntax_t;

-/* 1 means plain | serves as the "or"-operator, and \| is a literal.
-   0 means \| serves as the "or"-operator, and | is a literal.  */
-#define RE_NO_BK_VBAR 2
+/* If this bit is not set, then \ inside a bracket expression is literal.
+   If set, then such a \ quotes the following character.  */
+#define RE_BACKSLASH_ESCAPE_IN_LISTS (1)

-/* 0 means plain + or ? serves as an operator, and \+, \? are literals.
-   1 means \+, \? are operators and plain +, ? are literals.  */
-#define RE_BK_PLUS_QM 4
+/* If this bit is not set, then + and ? are operators, and \+ and \? are
+     literals. 
+   If set, then \+ and \? are operators and + and ? are literals.  */
+#define RE_BK_PLUS_QM (RE_BACKSLASH_ESCAPE_IN_LISTS << 1)

-/* 1 means | binds tighter than ^ or $.
-   0 means the contrary.  */
-#define RE_TIGHT_VBAR 8
+/* If this bit is set, then character classes are supported.  They are:
+     [:alpha:], [:upper:], [:lower:],  [:digit:], [:alnum:], [:xdigit:],
+     [:space:], [:print:], [:punct:], [:graph:], and [:cntrl:].
+   If not set, then character classes are not supported.  */
+#define RE_CHAR_CLASSES (RE_BK_PLUS_QM << 1)

-/* 1 means treat \n as an _OR operator
-   0 means treat it as a normal character */
-#define RE_NEWLINE_OR 16
+/* If this bit is set, then ^ and $ are always anchors (outside bracket
+     expressions, of course).
+   If this bit is not set, then it depends:
+        ^  is an anchor if it is at the beginning of a regular
+           expression or after an open-group or an alternation operator;
+        $  is an anchor if it is at the end of a regular expression, or
+           before a close-group or an alternation operator.  

-/* 0 means that a special characters (such as *, ^, and $) always have
-     their special meaning regardless of the surrounding context.
-   1 means that special characters may act as normal characters in some
-     contexts.  Specifically, this applies to:
-	^ - only special at the beginning, or after ( or |
-	$ - only special at the end, or before ) or |
-	*, +, ? - only special when not after the beginning, (, or | */
-#define RE_CONTEXT_INDEP_OPS 32
+   This bit could be (re)combined with RE_CONTEXT_INDEP_OPS, because
+   POSIX draft 11.2 says that * etc. in leading positions is undefined.
+   We already implemented a previous draft which made those constructs
+   invalid, though, so we haven't changed the code back.  */
+#define RE_CONTEXT_INDEP_ANCHORS (RE_CHAR_CLASSES << 1)

-/* Now define combinations of bits for the standard possibilities.  */
-#define RE_SYNTAX_AWK (RE_NO_BK_PARENS | RE_NO_BK_VBAR | RE_CONTEXT_INDEP_OPS)
-#define RE_SYNTAX_EGREP (RE_SYNTAX_AWK | RE_NEWLINE_OR)
-#define RE_SYNTAX_GREP (RE_BK_PLUS_QM | RE_NEWLINE_OR)
+/* If this bit is set, then special characters are always special
+     regardless of where they are in the pattern.
+   If this bit is not set, then special characters are special only in
+     some contexts; otherwise they are ordinary.  Specifically, 
+     * + ? and intervals are only special when not after the beginning,
+     open-group, or alternation operator.  */
+#define RE_CONTEXT_INDEP_OPS (RE_CONTEXT_INDEP_ANCHORS << 1)
+
+/* If this bit is set, then *, +, ?, and { cannot be first in an re or
+     immediately after an alternation or begin-group operator.  */
+#define RE_CONTEXT_INVALID_OPS (RE_CONTEXT_INDEP_OPS << 1)
+
+/* If this bit is set, then . matches newline.
+   If not set, then it doesn't.  */
+#define RE_DOT_NEWLINE (RE_CONTEXT_INVALID_OPS << 1)
+
+/* If this bit is set, then . doesn't match NUL.
+   If not set, then it does.  */
+#define RE_DOT_NOT_NULL (RE_DOT_NEWLINE << 1)
+
+/* If this bit is set, nonmatching lists [^...] do not match newline.
+   If not set, they do.  */
+#define RE_HAT_LISTS_NOT_NEWLINE (RE_DOT_NOT_NULL << 1)
+
+/* If this bit is set, either \{...\} or {...} defines an
+     interval, depending on RE_NO_BK_BRACES. 
+   If not set, \{, \}, {, and } are literals.  */
+#define RE_INTERVALS (RE_HAT_LISTS_NOT_NEWLINE << 1)
+
+/* If this bit is set, +, ? and | aren't recognized as operators.
+   If not set, they are.  */
+#define RE_LIMITED_OPS (RE_INTERVALS << 1)
+
+/* If this bit is set, newline is an alternation operator.
+   If not set, newline is literal.  */
+#define RE_NEWLINE_ALT (RE_LIMITED_OPS << 1)
+
+/* If this bit is set, then `{...}' defines an interval, and \{ and \}
+     are literals.
+  If not set, then `\{...\}' defines an interval.  */
+#define RE_NO_BK_BRACES (RE_NEWLINE_ALT << 1)
+
+/* If this bit is set, (...) defines a group, and \( and \) are literals.
+   If not set, \(...\) defines a group, and ( and ) are literals.  */
+#define RE_NO_BK_PARENS (RE_NO_BK_BRACES << 1)
+
+/* If this bit is set, then \<digit> matches <digit>.
+   If not set, then \<digit> is a back-reference.  */
+#define RE_NO_BK_REFS (RE_NO_BK_PARENS << 1)
+
+/* If this bit is set, then | is an alternation operator, and \| is literal. 
+   If not set, then \| is an alternation operator, and | is literal.  */
+#define RE_NO_BK_VBAR (RE_NO_BK_REFS << 1)
+
+/* If this bit is set, then an ending range point collating lower
+     than the starting range point, as in [z-a], is invalid.
+   If not set, then when ending range point collates lower than the
+     starting range point, the range is ignored.  */
+#define RE_NO_EMPTY_RANGES (RE_NO_BK_VBAR << 1)
+
+/* If this bit is set, then an unmatched ) is ordinary.
+   If not set, then an unmatched ) is invalid.  */
+#define RE_UNMATCHED_RIGHT_PAREN_ORD (RE_NO_EMPTY_RANGES << 1)
+
+/* This global variable defines the particular regexp syntax to use (for
+   some interfaces).  When a regexp is compiled, the syntax used is
+   stored in the pattern buffer, so changing this does not affect
+   already-compiled regexps.  */
+extern reg_syntax_t re_syntax_options;
+
+/* Define combinations of the above bits for the standard possibilities.
+   (The [[[ comments delimit what gets put into the Texinfo file, so
+   don't delete them!)  */ 
+/* [[[begin syntaxes]]] */
 #define RE_SYNTAX_EMACS 0

-/* This data structure is used to represent a compiled pattern. */
+#define RE_SYNTAX_AWK							\
+  (RE_BACKSLASH_ESCAPE_IN_LISTS | RE_DOT_NOT_NULL			\
+   | RE_NO_BK_PARENS            | RE_NO_BK_REFS				\
+   | RE_NO_BK_VBAR               | RE_NO_EMPTY_RANGES			\
+   | RE_UNMATCHED_RIGHT_PAREN_ORD)
+
+#define RE_SYNTAX_POSIX_AWK 						\
+  (RE_SYNTAX_POSIX_EXTENDED | RE_BACKSLASH_ESCAPE_IN_LISTS)
+
+#define RE_SYNTAX_GREP							\
+  (RE_BK_PLUS_QM              | RE_CHAR_CLASSES				\
+   | RE_HAT_LISTS_NOT_NEWLINE | RE_INTERVALS				\
+   | RE_NEWLINE_ALT)
+
+#define RE_SYNTAX_EGREP							\
+  (RE_CHAR_CLASSES        | RE_CONTEXT_INDEP_ANCHORS			\
+   | RE_CONTEXT_INDEP_OPS | RE_HAT_LISTS_NOT_NEWLINE			\
+   | RE_NEWLINE_ALT       | RE_NO_BK_PARENS				\
+   | RE_NO_BK_VBAR)
+
+#define RE_SYNTAX_POSIX_EGREP						\
+  (RE_SYNTAX_EGREP | RE_INTERVALS | RE_NO_BK_BRACES)
+
+/* P1003.2/D11.2, section 4.20.7.1, lines 5078ff.  */
+#define RE_SYNTAX_ED RE_SYNTAX_POSIX_BASIC
+
+#define RE_SYNTAX_SED RE_SYNTAX_POSIX_BASIC
+
+/* Syntax bits common to both basic and extended POSIX regex syntax.  */
+#define _RE_SYNTAX_POSIX_COMMON						\
+  (RE_CHAR_CLASSES | RE_DOT_NEWLINE      | RE_DOT_NOT_NULL		\
+   | RE_INTERVALS  | RE_NO_EMPTY_RANGES)
+
+#define RE_SYNTAX_POSIX_BASIC						\
+  (_RE_SYNTAX_POSIX_COMMON | RE_BK_PLUS_QM)
+
+/* Differs from ..._POSIX_BASIC only in that RE_BK_PLUS_QM becomes
+   RE_LIMITED_OPS, i.e., \? \+ \| are not recognized.  Actually, this
+   isn't minimal, since other operators, such as \`, aren't disabled.  */
+#define RE_SYNTAX_POSIX_MINIMAL_BASIC					\
+  (_RE_SYNTAX_POSIX_COMMON | RE_LIMITED_OPS)
+
+#define RE_SYNTAX_POSIX_EXTENDED					\
+  (_RE_SYNTAX_POSIX_COMMON | RE_CONTEXT_INDEP_ANCHORS			\
+   | RE_CONTEXT_INDEP_OPS  | RE_NO_BK_BRACES				\
+   | RE_NO_BK_PARENS       | RE_NO_BK_VBAR				\
+   | RE_UNMATCHED_RIGHT_PAREN_ORD)
+
+/* Differs from ..._POSIX_EXTENDED in that RE_CONTEXT_INVALID_OPS
+   replaces RE_CONTEXT_INDEP_OPS and RE_NO_BK_REFS is added.  */
+#define RE_SYNTAX_POSIX_MINIMAL_EXTENDED				\
+  (_RE_SYNTAX_POSIX_COMMON  | RE_CONTEXT_INDEP_ANCHORS			\
+   | RE_CONTEXT_INVALID_OPS | RE_NO_BK_BRACES				\
+   | RE_NO_BK_PARENS        | RE_NO_BK_REFS				\
+   | RE_NO_BK_VBAR	    | RE_UNMATCHED_RIGHT_PAREN_ORD)
+/* [[[end syntaxes]]] */
+
+/* Maximum number of duplicates an interval can allow.  Some systems
+   (erroneously) define this in other header files, but we want our
+   value, so remove any previous define.  */
+#ifdef RE_DUP_MAX
+#undef RE_DUP_MAX
+#endif
+#define RE_DUP_MAX ((1 << 15) - 1) 
+
+
+/* POSIX `cflags' bits (i.e., information for `regcomp').  */
+
+/* If this bit is set, then use extended regular expression syntax.
+   If not set, then use basic regular expression syntax.  */
+#define REG_EXTENDED 1
+
+/* If this bit is set, then ignore case when matching.
+   If not set, then case is significant.  */
+#define REG_ICASE (REG_EXTENDED << 1)
+ 
+/* If this bit is set, then anchors match at newline characters in
+     the string.
+   If not set, then anchors do not match at newlines.  */
+#define REG_NEWLINE (REG_ICASE << 1)
+
+/* If this bit is set, then regexec reports only success or failure.
+   If not set, then regexec also records subexpression match positions.  */
+#define REG_NOSUB (REG_NEWLINE << 1)
+
+
+/* POSIX `eflags' bits (i.e., information for regexec).  */
+
+/* If this bit is set, then the beginning-of-line operator doesn't match
+     the beginning of the string (presumably because it's not the
+     beginning of a line).
+   If not set, then the beginning-of-line operator does match the
+     beginning of the string.  */
+#define REG_NOTBOL 1
+
+/* Like REG_NOTBOL, except for the end-of-line.  */
+#define REG_NOTEOL (1 << 1)
+
+
+/* If any error codes are removed, changed, or added, update the
+   `re_error_msg' table in regex.c.  */
+typedef enum
+{
+  REG_NOERROR = 0,	/* Success.  */
+  REG_NOMATCH,		/* Didn't find a match (for regexec).  */
+
+  /* POSIX regcomp return error codes.  (In the order listed in the
+     standard.)  */
+  REG_BADPAT,		/* Invalid pattern.  */
+  REG_ECOLLATE,		/* Not implemented.  */
+  REG_ECTYPE,		/* Invalid character class name.  */
+  REG_EESCAPE,		/* Trailing backslash.  */
+  REG_ESUBREG,		/* Invalid back reference.  */
+  REG_EBRACK,		/* Unmatched left bracket.  */
+  REG_EPAREN,		/* Parenthesis imbalance.  */ 
+  REG_EBRACE,		/* Unmatched \{.  */
+  REG_BADBR,		/* Invalid contents of \{\}.  */
+  REG_ERANGE,		/* Invalid range end.  */
+  REG_ESPACE,		/* Ran out of memory.  */
+  REG_BADRPT,		/* No preceding re for repetition op.  */
+
+  /* Error codes we've added.  */
+  REG_EEND,		/* Premature end.  */
+  REG_ESIZE,		/* Compiled pattern bigger than 2^16 bytes.  */
+  REG_ERPAREN		/* Unmatched ) or \); not returned from regcomp.  */
+} reg_errcode_t;
+
+/* This data structure represents a compiled pattern.  Before calling
+   the pattern compiler, the fields `buffer', `allocated', `fastmap',
+   `translate', and `no_sub' can be set.  After the pattern has been
+   compiled, the `re_nsub' field is available.  All other fields are
+   private to the regex routines.  */

 struct re_pattern_buffer
 {
-    char *buffer;	/* Space holding the compiled pattern commands. */
-    int allocated;	/* Size of space that  buffer  points to */
-    int used;		/* Length of portion of buffer actually occupied */
-    char *fastmap;	/* Pointer to fastmap, if any, or zero if none. */
-			/* re_search uses the fastmap, if there is one,
-			   to skip quickly over totally implausible characters */
-    char *translate;	/* Translate table to apply to all characters before comparing.
-			   Or zero for no translation.
-			   The translation is applied to a pattern when it is compiled
-			   and to data when it is matched. */
-    char fastmap_accurate;
-			/* Set to zero when a new pattern is stored,
-			   set to one when the fastmap is updated from it. */
-    char can_be_null;   /* Set to one by compiling fastmap
-			   if this pattern might match the null string.
-			   It does not necessarily match the null string
-			   in that case, but if this is zero, it cannot.
-			   2 as value means can match null string
-			   but at end of range or before a character
-			   listed in the fastmap.  */
+/* [[[begin pattern_buffer]]] */
+	/* Space that holds the compiled pattern.  It is declared as
+          `unsigned char *' because its elements are
+           sometimes used as array indexes.  */
+  unsigned char *buffer;
+
+	/* Number of bytes to which `buffer' points.  */
+  unsigned long allocated;
+
+	/* Number of bytes actually used in `buffer'.  */
+  unsigned long used;	
+
+        /* Syntax setting with which the pattern was compiled.  */
+  reg_syntax_t syntax;
+
+        /* Pointer to a fastmap, if any, otherwise zero.  re_search uses
+           the fastmap, if there is one, to skip over impossible
+           starting points for matches.  */
+  char *fastmap;
+
+        /* Either a translate table to apply to all characters before
+           comparing them, or zero for no translation.  The translation
+           is applied to a pattern when it is compiled and to a string
+           when it is matched.  */
+  char *translate;
+
+	/* Number of subexpressions found by the compiler.  */
+  size_t re_nsub;
+
+        /* Zero if this pattern cannot match the empty string, one else.
+           Well, in truth it's used only in `re_search_2', to see
+           whether or not we should use the fastmap, so we don't set
+           this absolutely perfectly; see `re_compile_fastmap' (the
+           `duplicate' case).  */
+  unsigned can_be_null : 1;
+
+        /* If REGS_UNALLOCATED, allocate space in the `regs' structure
+             for `max (RE_NREGS, re_nsub + 1)' groups.
+           If REGS_REALLOCATE, reallocate space if necessary.
+           If REGS_FIXED, use what's there.  */
+#define REGS_UNALLOCATED 0
+#define REGS_REALLOCATE 1
+#define REGS_FIXED 2
+  unsigned regs_allocated : 2;
+
+        /* Set to zero when `regex_compile' compiles a pattern; set to one
+           by `re_compile_fastmap' if it updates the fastmap.  */
+  unsigned fastmap_accurate : 1;
+
+        /* If set, `re_match_2' does not return information about
+           subexpressions.  */
+  unsigned no_sub : 1;
+
+        /* If set, a beginning-of-line anchor doesn't match at the
+           beginning of the string.  */ 
+  unsigned not_bol : 1;
+
+        /* Similarly for an end-of-line anchor.  */
+  unsigned not_eol : 1;
+
+        /* If true, an anchor at a newline matches.  */
+  unsigned newline_anchor : 1;
+
+/* [[[end pattern_buffer]]] */
 };

-/* Structure to store "register" contents data in.
+typedef struct re_pattern_buffer regex_t;

-   Pass the address of such a structure as an argument to re_match, etc.,
-   if you want this information back.

-   start[i] and end[i] record the string matched by \( ... \) grouping i,
-   for i from 1 to RE_NREGS - 1.
-   start[0] and end[0] record the entire string matched. */
+/* search.c (search_buffer) in Emacs needs this one opcode value.  It is
+   defined both in `regex.c' and here.  */
+#define RE_EXACTN_VALUE 1
+
+/* Type for byte offsets within the string.  POSIX mandates this.  */
+typedef int regoff_t;

+
+/* This is the structure we store register match data in.  See
+   regex.texinfo for a full description of what registers match.  */
 struct re_registers
 {
-    int start[RE_NREGS];
-    int end[RE_NREGS];
+  unsigned num_regs;
+  regoff_t *start;
+  regoff_t *end;
 };

-/* These are the command codes that appear in compiled regular expressions, one per byte.
-  Some command codes are followed by argument bytes.
-  A command code can specify any interpretation whatever for its arguments.
-  Zero-bytes may appear in the compiled regular expression. */

-enum regexpcode
-  {
-    unused,
-    exactn,    /* followed by one byte giving n, and then by n literal bytes */
-    begline,   /* fails unless at beginning of line */
-    endline,   /* fails unless at end of line */
-    jump,	 /* followed by two bytes giving relative address to jump to */
-    on_failure_jump,	 /* followed by two bytes giving relative address of place
-		            to resume at in case of failure. */
-    finalize_jump,	 /* Throw away latest failure point and then jump to address. */
-    maybe_finalize_jump, /* Like jump but finalize if safe to do so.
-			    This is used to jump back to the beginning
-			    of a repeat.  If the command that follows
-			    this jump is clearly incompatible with the
-			    one at the beginning of the repeat, such that
-			    we can be sure that there is no use backtracking
-			    out of repetitions already completed,
-			    then we finalize. */
-    dummy_failure_jump,  /* jump, and push a dummy failure point.
-			    This failure point will be thrown away
-			    if an attempt is made to use it for a failure.
-			    A + construct makes this before the first repeat.  */
-    anychar,	 /* matches any one character */
-    charset,     /* matches any one char belonging to specified set.
-		    First following byte is # bitmap bytes.
-		    Then come bytes for a bit-map saying which chars are in.
-		    Bits in each byte are ordered low-bit-first.
-		    A character is in the set if its bit is 1.
-		    A character too large to have a bit in the map
-		    is automatically not in the set */
-    charset_not, /* similar but match any character that is NOT one of those specified */
-    start_memory, /* starts remembering the text that is matched
-		    and stores it in a memory register.
-		    followed by one byte containing the register number.
-		    Register numbers must be in the range 0 through NREGS. */
-    stop_memory, /* stops remembering the text that is matched
-		    and stores it in a memory register.
-		    followed by one byte containing the register number.
-		    Register numbers must be in the range 0 through NREGS. */
-    duplicate,    /* match a duplicate of something remembered.
-		    Followed by one byte containing the index of the memory register. */
-    before_dot,	 /* Succeeds if before dot */
-    at_dot,	 /* Succeeds if at dot */
-    after_dot,	 /* Succeeds if after dot */
-    begbuf,      /* Succeeds if at beginning of buffer */
-    endbuf,      /* Succeeds if at end of buffer */
-    wordchar,    /* Matches any word-constituent character */
-    notwordchar, /* Matches any char that is not a word-constituent */
-    wordbeg,	 /* Succeeds if at word beginning */
-    wordend,	 /* Succeeds if at word end */
-    wordbound,   /* Succeeds if at a word boundary */
-    notwordbound, /* Succeeds if not at a word boundary */
-    syntaxspec,  /* Matches any character whose syntax is specified.
-		    followed by a byte which contains a syntax code, Sword or such like */
-    notsyntaxspec /* Matches any character whose syntax differs from the specified. */
-  };
-
-extern char *re_compile_pattern ();
-/* Is this really advertised? */
-extern void re_compile_fastmap ();
-extern int re_search (), re_search_2 ();
-extern int re_match (), re_match_2 ();
-
-/* 4.2 bsd compatibility (yuck) */
-extern char *re_comp ();
-extern int re_exec ();
-
-#ifdef SYNTAX_TABLE
-extern char *re_syntax_table;
+/* If `regs_allocated' is REGS_UNALLOCATED in the pattern buffer,
+   `re_match_2' returns information about at least this many registers
+   the first time a `regs' structure is passed.  */
+#ifndef RE_NREGS
+#define RE_NREGS 30
 #endif
+
+
+/* POSIX specification for registers.  Aside from the different names than
+   `re_registers', POSIX uses an array of structures, instead of a
+   structure of arrays.  */
+typedef struct
+{
+  regoff_t rm_so;  /* Byte offset from string's start to substring's start.  */
+  regoff_t rm_eo;  /* Byte offset from string's start to substring's end.  */
+} regmatch_t;
+
+/* Declarations for routines.  */
+
+/* To avoid duplicating every routine declaration -- once with a
+   prototype (if we are ANSI), and once without (if we aren't) -- we
+   use the following macro to declare argument types.  This
+   unfortunately clutters up the declarations a bit, but I think it's
+   worth it.  */
+
+#if __STDC__
+
+#define _RE_ARGS(args) args
+
+#else /* not __STDC__ */
+
+#define _RE_ARGS(args) ()
+
+#endif /* not __STDC__ */
+
+/* Set the current default syntax to SYNTAX, and return the old syntax.
+   You can also simply assign to the `re_syntax_options' variable.  */
+extern reg_syntax_t re_set_syntax _RE_ARGS ((reg_syntax_t syntax));
+
+/* Compile the regular expression PATTERN, with length LENGTH
+   and syntax given by the global `re_syntax_options', into the buffer
+   BUFFER.  Return NULL if successful, and an error string if not.  */
+extern const char *re_compile_pattern
+  _RE_ARGS ((const char *pattern, int length,
+             struct re_pattern_buffer *buffer));
+
+
+/* Compile a fastmap for the compiled pattern in BUFFER; used to
+   accelerate searches.  Return 0 if successful and -2 if there was an
+   internal error.  */
+extern int re_compile_fastmap _RE_ARGS ((struct re_pattern_buffer *buffer));
+
+
+/* Search in the string STRING (with length LENGTH) for the pattern
+   compiled into BUFFER.  Start searching at position START, for RANGE
+   characters.  Return the starting position of the match, -1 for no
+   match, or -2 for an internal error.  Also return register
+   information in REGS (if REGS is nonzero and BUFFER->no_sub is zero).  */
+extern int re_search
+  _RE_ARGS ((struct re_pattern_buffer *buffer, const char *string,
+            int length, int start, int range, struct re_registers *regs));
+
+
+/* Like `re_search', but search in the concatenation of STRING1 and
+   STRING2.  Also, stop searching at index START + STOP.  */
+extern int re_search_2
+  _RE_ARGS ((struct re_pattern_buffer *buffer, const char *string1,
+             int length1, const char *string2, int length2,
+             int start, int range, struct re_registers *regs, int stop));
+
+
+/* Like `re_search', but return how many characters in STRING the regexp
+   in BUFFER matched, starting at position START.  */
+extern int re_match
+  _RE_ARGS ((struct re_pattern_buffer *buffer, const char *string,
+             int length, int start, struct re_registers *regs));
+
+
+/* Relates to `re_match' as `re_search_2' relates to `re_search'.  */
+extern int re_match_2 
+  _RE_ARGS ((struct re_pattern_buffer *buffer, const char *string1,
+             int length1, const char *string2, int length2,
+             int start, struct re_registers *regs, int stop));
+
+
+/* Set REGS to hold NUM_REGS registers, storing them in STARTS and
+   ENDS.  Subsequent matches using BUFFER and REGS will use this memory
+   for recording register information.  STARTS and ENDS must be
+   allocated with malloc, and must each be at least `NUM_REGS * sizeof
+   (regoff_t)' bytes long.
+
+   If NUM_REGS == 0, then subsequent matches should allocate their own
+   register data.
+
+   Unless this function is called, the first search or match using
+   BUFFER will allocate its own register data, without
+   freeing the old data.  */
+extern void re_set_registers
+  _RE_ARGS ((struct re_pattern_buffer *buffer, struct re_registers *regs,
+             unsigned num_regs, regoff_t *starts, regoff_t *ends));
+
+/* 4.2 bsd compatibility.  */
+extern char *re_comp _RE_ARGS ((const char *));
+extern int re_exec _RE_ARGS ((const char *));
+
+/* POSIX compatibility.  */
+extern int regcomp _RE_ARGS ((regex_t *preg, const char *pattern, int cflags));
+extern int regexec
+  _RE_ARGS ((const regex_t *preg, const char *string, size_t nmatch,
+             regmatch_t pmatch[], int eflags));
+extern size_t regerror
+  _RE_ARGS ((int errcode, const regex_t *preg, char *errbuf,
+             size_t errbuf_size));
+extern void regfree _RE_ARGS ((regex_t *preg));
+
+#endif /* not __REGEXP_LIBRARY_H__ */
+
+/*
+Local variables:
+make-backup-files: t
+version-control: t
+trim-versions-without-asking: nil
+End:
+*/
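The POSIX entry points declared at the end of regex.h (regcomp, regexec, regfree) can be exercised with a short sketch like the one below. It is illustrative only and is not part of this commit: the helper name first_match is hypothetical, and the code is written against the host system's <regex.h> rather than the copy added here.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of regex.h): compile PATTERN as an
   extended regular expression, search TEXT, and store the byte offsets
   of the whole match in *SO and *EO.  Returns 0 on a match, REG_NOMATCH
   when nothing matches, or another nonzero regcomp error code.  */
static int
first_match (const char *pattern, const char *text, int *so, int *eo)
{
  regex_t preg;
  regmatch_t pmatch[1];
  int rc;

  rc = regcomp (&preg, pattern, REG_EXTENDED);
  if (rc != 0)
    return rc;			/* Pattern did not compile.  */

  rc = regexec (&preg, text, 1, pmatch, 0);
  if (rc == 0)
    {
      /* pmatch[0] describes the entire match, as regs.start[0] and
         regs.end[0] do in the re_registers interface.  */
      *so = (int) pmatch[0].rm_so;
      *eo = (int) pmatch[0].rm_eo;
    }
  regfree (&preg);
  return rc;
}
```

Calling first_match ("b+", "aabbbc", &so, &eo) is expected to report the half-open byte range of the run of b characters. The offsets follow the regmatch_t convention above: rm_so is the start of the match and rm_eo is one past its end.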
--- a/gnu/usr.bin/grep/search.c
+++ b/gnu/usr.bin/grep/search.c
@ -0,0 +1,481 @@
+/* search.c - searching subroutines using dfa, kwset and regex for grep.
+   Copyright (C) 1992 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 2, or (at your option)
+   any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, write to the Free Software
+   Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
+
+   Written August 1992 by Mike Haertel. */
+
+#include <ctype.h>
+
+#ifdef STDC_HEADERS
+#include <limits.h>
+#include <stdlib.h>
+#else
+#define UCHAR_MAX 255
+#include <sys/types.h>
+extern char *malloc();
+#endif
+
+#ifdef HAVE_MEMCHR
+#include <string.h>
+#ifdef NEED_MEMORY_H
+#include <memory.h>
+#endif
+#else
+#ifdef __STDC__
+extern void *memchr();
+#else
+extern char *memchr();
+#endif
+#endif
+
+#if defined(HAVE_STRING_H) || defined(STDC_HEADERS)
+#undef bcopy
+#define bcopy(s, d, n) memcpy((d), (s), (n))
+#endif
+
+#ifdef isascii
+#define ISALNUM(C) (isascii(C) && isalnum(C))
+#define ISUPPER(C) (isascii(C) && isupper(C))
+#else
+#define ISALNUM(C) isalnum(C)
+#define ISUPPER(C) isupper(C)
+#endif
+
+#define TOLOWER(C) (ISUPPER(C) ? tolower(C) : (C))
+
+#include "grep.h"
+#include "dfa.h"
+#include "kwset.h"
+#include "regex.h"
+
+#define NCHAR (UCHAR_MAX + 1)
+
+#if __STDC__
+static void Gcompile(char *, size_t);
+static void Ecompile(char *, size_t);
+static char *EGexecute(char *, size_t, char **);
+static void Fcompile(char *, size_t);
+static char *Fexecute(char *, size_t, char **);
+#else
+static void Gcompile();
+static void Ecompile();
+static char *EGexecute();
+static void Fcompile();
+static char *Fexecute();
+#endif
+
+/* Here is the matchers vector for the main program. */
+struct matcher matchers[] = {
+  { "default", Gcompile, EGexecute },
+  { "grep", Gcompile, EGexecute },
+  { "ggrep", Gcompile, EGexecute },
+  { "egrep", Ecompile, EGexecute },
+  { "posix-egrep", Ecompile, EGexecute },
+  { "gegrep", Ecompile, EGexecute },
+  { "fgrep", Fcompile, Fexecute },
+  { "gfgrep", Fcompile, Fexecute },
+  { 0, 0, 0 },
+};
+
+/* For -w, we also consider _ to be word constituent.  */
+#define WCHAR(C) (ISALNUM(C) || (C) == '_')
+
+/* DFA compiled regexp. */
+static struct dfa dfa;
+
+/* Regex compiled regexp. */
+static struct re_pattern_buffer regex;
+
+/* KWset compiled pattern.  For Ecompile and Gcompile, we compile
+   a list of strings, at least one of which is known to occur in
+   any string matching the regexp. */
+static kwset_t kwset;
+
+/* Last compiled fixed string known to exactly match the regexp.
+   If kwsexec() returns < lastexact, then we don't need to
+   call the regexp matcher at all. */
+static int lastexact;
+
+void
+dfaerror(mesg)
+     char *mesg;
+{
+  fatal(mesg, 0);
+}
+
+static void
+kwsinit()
+{
+  static char trans[NCHAR];
+  int i;
+
+  if (match_icase)
+    for (i = 0; i < NCHAR; ++i)
+      trans[i] = TOLOWER(i);
+
+  if (!(kwset = kwsalloc(match_icase ? trans : (char *) 0)))
+    fatal("memory exhausted", 0);
+}  
+
+/* If the DFA turns out to have some set of fixed strings one of
+   which must occur in the match, then we build a kwset matcher
+   to find those strings, and thus quickly filter out impossible
+   matches. */
+static void
+kwsmusts()
+{
+  struct dfamust *dm;
+  char *err;
+
+  if (dfa.musts)
+    {
+      kwsinit();
+      /* First, we compile in the substrings known to be exact
+	 matches.  The kwset matcher will return the index
+	 of the matching string that it chooses. */
+      for (dm = dfa.musts; dm; dm = dm->next)
+	{
+	  if (!dm->exact)
+	    continue;
+	  ++lastexact;
+	  if ((err = kwsincr(kwset, dm->must, strlen(dm->must))) != 0)
+	    fatal(err, 0);
+	}
+      /* Now, we compile the substrings that will require
+	 the use of the regexp matcher.  */
+      for (dm = dfa.musts; dm; dm = dm->next)
+	{
+	  if (dm->exact)
+	    continue;
+	  if ((err = kwsincr(kwset, dm->must, strlen(dm->must))) != 0)
+	    fatal(err, 0);
+	}
+      if ((err = kwsprep(kwset)) != 0)
+	fatal(err, 0);
+    }
+}
+
+static void
+Gcompile(pattern, size)
+     char *pattern;
+     size_t size;
+{
+#ifdef __STDC__
+  const
+#endif
+  char *err;
+
+  re_set_syntax(RE_SYNTAX_GREP | RE_HAT_LISTS_NOT_NEWLINE);
+  dfasyntax(RE_SYNTAX_GREP | RE_HAT_LISTS_NOT_NEWLINE, match_icase);
+
+  if ((err = re_compile_pattern(pattern, size, &regex)) != 0)
+    fatal(err, 0);
+
+  dfainit(&dfa);
+
+  /* In the match_words and match_lines cases, we use a different pattern
+     for the DFA matcher that will quickly throw out cases that won't work.
+     Then if DFA succeeds we do some hairy stuff using the regex matcher
+     to decide whether the match should really count. */
+  if (match_words || match_lines)
+    {
+      /* In the whole-word case, we use the pattern:
+	 (^|[^0-9A-Za-z_])(userpattern)([^0-9A-Za-z_]|$).
+	 In the whole-line case, we use the pattern:
+	 ^(userpattern)$.
+	 BUG: Using [0-9A-Za-z_] is locale-dependent!  */
+
+      char *n = malloc(size + 50);
+      int i = 0;
+
+      strcpy(n, "");
+
+      if (match_lines)
+	strcpy(n, "^\\(");
+      if (match_words)
+	strcpy(n, "\\(^\\|[^0-9A-Za-z_]\\)\\(");
+
+      i = strlen(n);
+      bcopy(pattern, n + i, size);
+      i += size;
+
+      if (match_words)
+	strcpy(n + i, "\\)\\([^0-9A-Za-z_]\\|$\\)");
+      if (match_lines)
+	strcpy(n + i, "\\)$");
+
+      i += strlen(n + i);
+      dfacomp(n, i, &dfa, 1);
+    }
+  else
+    dfacomp(pattern, size, &dfa, 1);
+
+  kwsmusts();
+}
+
+static void
+Ecompile(pattern, size)
+     char *pattern;
+     size_t size;
+{
+#ifdef __STDC__
+  const
+#endif
+  char *err;
+
+  if (strcmp(matcher, "posix-egrep") == 0)
+    {
+      re_set_syntax(RE_SYNTAX_POSIX_EGREP);
+      dfasyntax(RE_SYNTAX_POSIX_EGREP, match_icase);
+    }
+  else
+    {
+      re_set_syntax(RE_SYNTAX_EGREP);
+      dfasyntax(RE_SYNTAX_EGREP, match_icase);
+    }
+
+  if ((err = re_compile_pattern(pattern, size, &regex)) != 0)
+    fatal(err, 0);
+
+  dfainit(&dfa);
+
+  /* In the match_words and match_lines cases, we use a different pattern
+     for the DFA matcher that will quickly throw out cases that won't work.
+     Then if DFA succeeds we do some hairy stuff using the regex matcher
+     to decide whether the match should really count. */
+  if (match_words || match_lines)
+    {
+      /* In the whole-word case, we use the pattern:
+	 (^|[^0-9A-Za-z_])(userpattern)([^0-9A-Za-z_]|$).
+	 In the whole-line case, we use the pattern:
+	 ^(userpattern)$.
+	 BUG: Using [0-9A-Za-z_] is locale-dependent!  */
+
+      char *n = malloc(size + 50);
+      int i = 0;
+
+      strcpy(n, "");
+
+      if (match_lines)
+	strcpy(n, "^(");
+      if (match_words)
+	strcpy(n, "(^|[^0-9A-Za-z_])(");
+
+      i = strlen(n);
+      bcopy(pattern, n + i, size);
+      i += size;
+
+      if (match_words)
+	strcpy(n + i, ")([^0-9A-Za-z_]|$)");
+      if (match_lines)
+	strcpy(n + i, ")$");
+
+      i += strlen(n + i);
+      dfacomp(n, i, &dfa, 1);
+    }
+  else
+    dfacomp(pattern, size, &dfa, 1);
+
+  kwsmusts();
+}
+
+static char *
+EGexecute(buf, size, endp)
+     char *buf;
+     size_t size;
+     char **endp;
+{
+  register char *buflim, *beg, *end, save;
+  int backref, start, len;
+  struct kwsmatch kwsm;
+  static struct re_registers regs; /* This is static on account of a BRAIN-DEAD
+				    Q@#%!# library interface in regex.c.  */
+
+  buflim = buf + size;
+
+  for (beg = end = buf; end < buflim; beg = end + 1)
+    {
+      if (kwset)
+	{
+	  /* Find a possible match using the KWset matcher. */
+	  beg = kwsexec(kwset, beg, buflim - beg, &kwsm);
+	  if (!beg)
+	    goto failure;
+	  /* Narrow down to the line containing the candidate, and
+	     run it through DFA. */
+	  end = memchr(beg, '\n', buflim - beg);
+	  if (!end)
+	    end = buflim;
+	  while (beg > buf && beg[-1] != '\n')
+	    --beg;
+	  save = *end;
+	  if (kwsm.index < lastexact)
+	    goto success;
+	  if (!dfaexec(&dfa, beg, end, 0, (int *) 0, &backref))
+	    {
+	      *end = save;
+	      continue;
+	    }
+	  *end = save;
+	  /* Successful, no backreferences encountered. */
+	  if (!backref)
+	    goto success;
+	}
+      else
+	{
+	  /* No good fixed strings; start with DFA. */
+	  save = *buflim;
+	  beg = dfaexec(&dfa, beg, buflim, 0, (int *) 0, &backref);
+	  *buflim = save;
+	  if (!beg)
+	    goto failure;
+	  /* Narrow down to the line we've found. */
+	  end = memchr(beg, '\n', buflim - beg);
+	  if (!end)
+	    end = buflim;
+	  while (beg > buf && beg[-1] != '\n')
+	    --beg;
+	  /* Successful, no backreferences encountered! */
+	  if (!backref)
+	    goto success;
+	}
+      /* If we've made it to this point, this means DFA has seen
+	 a probable match, and we need to run it through Regex. */
+      regex.not_eol = 0;
+      if ((start = re_search(&regex, beg, end - beg, 0, end - beg, &regs)) >= 0)
+	{
+	  len = regs.end[0] - start;
+	  if ((!match_lines && !match_words) || (match_lines && len == end - beg))
+	    goto success;
+	  /* If -w, check if the match aligns with word boundaries.
+	     We do this iteratively because:
+	     (a) the line may contain more than one occurrence of the pattern, and
+	     (b) Several alternatives in the pattern might be valid at a given
+	     point, and we may need to consider a shorter one to find a word
+	     boundary. */
+	  if (match_words)
+	    while (start >= 0)
+	      {
+		if ((start == 0 || !WCHAR(beg[start - 1]))
+		    && (len == end - beg || !WCHAR(beg[start + len])))
+		  goto success;
+		if (len > 0)
+		  {
+		    /* Try a shorter length anchored at the same place. */
+		    --len;
+		    regex.not_eol = 1;
+		    len = re_match(&regex, beg, start + len, start, &regs);
+		  }
+		if (len <= 0)
+		  {
+		    /* Try looking further on. */
+		    if (start == end - beg)
+		      break;
+		    ++start;
+		    regex.not_eol = 0;
+		    start = re_search(&regex, beg, end - beg,
+				      start, end - beg - start, &regs);
+		    len = regs.end[0] - start;
+		  }
+	      }
+	}
+    }
+
+ failure:
+  return 0;
+
+ success:
+  *endp = end < buflim ? end + 1 : end;
+  return beg;
+}
+
+static void
+Fcompile(pattern, size)
+     char *pattern;
+     size_t size;
+{
+  char *beg, *lim, *err;
+
+  kwsinit();
+  beg = pattern;
+  do
+    {
+      for (lim = beg; lim < pattern + size && *lim != '\n'; ++lim)
+	;
+      if ((err = kwsincr(kwset, beg, lim - beg)) != 0)
+	fatal(err, 0);
+      if (lim < pattern + size)
+	++lim;
+      beg = lim;
+    }
+  while (beg < pattern + size);
+
+  if ((err = kwsprep(kwset)) != 0)
+    fatal(err, 0);
+}
+
+static char *
+Fexecute(buf, size, endp)
+     char *buf;
+     size_t size;
+     char **endp;
+{
+  register char *beg, *try, *end;
+  register size_t len;
+  struct kwsmatch kwsmatch;
+
+  for (beg = buf; beg <= buf + size; ++beg)
+    {
+      if (!(beg = kwsexec(kwset, beg, buf + size - beg, &kwsmatch)))
+	return 0;
+      len = kwsmatch.size[0];
+      if (match_lines)
+	{
+	  if (beg > buf && beg[-1] != '\n')
+	    continue;
+	  if (beg + len < buf + size && beg[len] != '\n')
+	    continue;
+	  goto success;
+	}
+      else if (match_words)
+	for (try = beg; len && try;)
+	  {
+	    if (try > buf && WCHAR((unsigned char) try[-1]))
+	      break;
+	    if (try + len < buf + size && WCHAR((unsigned char) try[len]))
+	      {
+		try = kwsexec(kwset, beg, --len, &kwsmatch);
+		len = kwsmatch.size[0];
+	      }
+	    else
+	      goto success;
+	  }
+      else
+	goto success;
+    }
+
+  return 0;
+
+ success:
+  if ((end = memchr(beg + len, '\n', (buf + size) - (beg + len))) != 0)
+    ++end;
+  else
+    end = buf + size;
+  *endp = end;
+  while (beg > buf && beg[-1] != '\n')
+    --beg;
+  return beg;
+}
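The -w handling in Fexecute above accepts a candidate only when the characters on either side of it are not word constituents. A minimal sketch of that boundary test follows. It is an illustration, not the code in search.c: strstr stands in for the kwset matcher, the names is_wchar and find_whole_word are hypothetical, and the retry with a shorter keyword that Fexecute performs is left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same notion of "word constituent" as WCHAR in search.c:
   alphanumerics plus underscore.  */
static int
is_wchar (unsigned char c)
{
  return isalnum (c) || c == '_';
}

/* Hypothetical helper: return a pointer to the first occurrence of WORD
   in LINE that is delimited by non-word characters (or by the ends of
   the line), or NULL if there is none.  strstr is an assumption made
   for brevity; grep itself uses kwsexec here.  */
static const char *
find_whole_word (const char *line, const char *word)
{
  size_t len = strlen (word);
  const char *p;

  if (len == 0)
    return NULL;

  for (p = strstr (line, word); p != NULL; p = strstr (p + 1, word))
    {
      int left_ok = (p == line) || !is_wchar ((unsigned char) p[-1]);
      int right_ok = (p[len] == '\0') || !is_wchar ((unsigned char) p[len]);

      if (left_ok && right_ok)
	return p;
      /* Otherwise keep scanning: a later occurrence may be aligned.  */
    }
  return NULL;
}
```

For the line "concatenate the cat_food for a cat." and the word "cat", the first two occurrences are rejected (one is preceded by a letter, the other is followed by an underscore) and only the final one is accepted.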
--- a/gnu/usr.bin/grep/tests/check.sh
+++ b/gnu/usr.bin/grep/tests/check.sh
@ -0,0 +1,24 @@
+#! /bin/sh
+# Regression test for GNU grep.
+# Usage: check.sh [testdir]
+
+testdir=${1-tests}
+
+failures=0
+
+# The Khadafy test is brought to you by Scott Anderson . . .
+./grep -E -f $testdir/khadafy.regexp $testdir/khadafy.lines > khadafy.out
+if cmp $testdir/khadafy.lines khadafy.out
+then
+	:
+else
+	echo Khadafy test failed -- output left on khadafy.out
+	failures=1
+fi
+
+# . . . and the following by Henry Spencer.
+
+${AWK-awk} -F: -f $testdir/scriptgen.awk $testdir/spencer.tests > tmp.script
+
+sh tmp.script && exit $failures
+exit 1
--- a/gnu/usr.bin/grep/tests/scriptgen.awk
+++ b/gnu/usr.bin/grep/tests/scriptgen.awk
@ -1,6 +1,6 @@
 BEGIN { print "failures=0"; }
-!/^#/ && NF == 3 {
-	print "echo '" $3 "' | $1/egrep -e '" $2 "' > /dev/null 2>&1";
+$0 !~ /^#/ && NF == 3 {
+	print "echo '" $3 "' | ./grep -E -e '" $2 "' > /dev/null 2>&1";
 	print "if [ $? != " $1 " ]"
 	print "then"
 	printf "\techo Spencer test \\#%d failed\n", ++n
--- a/gnu/usr.bin/grep/tests/spencer.tests
+++ b/gnu/usr.bin/grep/tests/spencer.tests
@ -33,7 +33,7 @@
 0:a[b-d]e:ace
 0:a[b-d]:aac
 0:a[-b]:a-
-2:a[b-]:a-
+0:a[b-]:a-
 1:a[b-a]:-
 2:a[]b:-
 2:a[:-
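Each record in spencer.tests has the form status:pattern:input, where status is the exit code expected from grep -E: 0 for a match, 1 for no match, 2 for a pattern error. The sketch below shows one way to evaluate such a record. It is an assumption-laden illustration: the helper name spencer_status is hypothetical, and it uses the host's POSIX regcomp rather than the matcher in this commit, so edge cases such as empty ranges may be classified differently than by GNU grep.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: mimic grep's exit status for one test record.
   Returns 0 if PATTERN (an extended regular expression) matches INPUT,
   1 if it does not, and 2 if PATTERN fails to compile.  */
static int
spencer_status (const char *pattern, const char *input)
{
  regex_t preg;
  int rc;

  /* REG_NOSUB: only success or failure is needed, not offsets.  */
  if (regcomp (&preg, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 2;

  rc = regexec (&preg, input, 0, NULL, 0);
  regfree (&preg);
  return rc == 0 ? 0 : 1;
}
```

Under these assumptions the record changed in this hunk, 0:a[b-]:a-, is satisfied because a trailing hyphen inside a bracket expression is taken literally, while a[]b still yields 2 because the bracket expression is never closed.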