71 lines
3.2 KiB
Plaintext
71 lines
3.2 KiB
Plaintext
This README documents GNU e?grep version 1.6. All bugs reported for
|
|
previous versions have been fixed.
|
|
|
|
See the file INSTALL for compilation and installation instructions.
|
|
|
|
Send bug reports to bug-gnu-utils@prep.ai.mit.edu.
|
|
|
|
GNU e?grep is provided "as is" with no warranty. The exact terms
|
|
under which you may use and (re)distribute this program are detailed
|
|
in the GNU General Public License, in the file COPYING.
|
|
|
|
GNU e?grep is based on a fast lazy-state deterministic matcher (about
|
|
twice as fast as stock Unix egrep) hybridized with a Boyer-Moore-Gosper
|
|
search for a fixed string that eliminates impossible text from being
|
|
considered by the full regexp matcher without necessarily having to
|
|
look at every character. The result is typically many times faster
|
|
than Unix grep or egrep. (Regular expressions containing backreferencing
|
|
may run more slowly, however.)
|
|
|
|
GNU e?grep is brought to you by the efforts of several people:
|
|
|
|
Mike Haertel wrote the deterministic regexp code and the bulk
|
|
of the program.
|
|
|
|
James A. Woods is responsible for the hybridized search strategy
|
|
of using Boyer-Moore-Gosper fixed-string search as a filter
|
|
before calling the general regexp matcher.
|
|
|
|
Arthur David Olson contributed code that finds fixed strings for
|
|
the aforementioned BMG search for a large class of regexps.
|
|
|
|
Richard Stallman wrote the backtracking regexp matcher that is
|
|
used for \<digit> backreferences, as well as the getopt that
|
|
is provided for 4.2BSD sites. The backtracking matcher was
|
|
originally written for GNU Emacs.
|
|
|
|
D. A. Gwyn wrote the C alloca emulation that is provided so
|
|
System V machines can run this program. (Alloca is used only
|
|
by RMS' backtracking matcher, and then only rarely, so there
|
|
is no loss if your machine doesn't have a "real" alloca.)
|
|
|
|
Scott Anderson and Henry Spencer designed the regression tests
|
|
used in the "regress" script.
|
|
|
|
Paul Placeway wrote the manual page, based on this README.
|
|
|
|
If you are interested in improving this program, you may wish to try
|
|
any of the following:
|
|
|
|
1. Replace the fast search loop with a faster search loop.
|
|
There are several things that could be improved, the most notable
|
|
of which would be to calculate a minimal delta2 to use.
|
|
|
|
2. Make backreferencing \<digit> faster. Right now, backreferencing is
|
|
handled by calling the Emacs backtracking matcher to verify the partial
|
|
match. This is slow; if the DFA routines could handle backreferencing
|
|
themselves a speedup on the order of three to four times might occur
|
|
in those cases where the backtracking matcher is called to verify nearly
|
|
every line. Also, some portability problems due to the inclusion of the
|
|
emacs matcher would be solved because it could then be eliminated.
|
|
Note that expressions with backreferencing are not true regular
|
|
expressions, and thus are not equivalent to any DFA. So this is hard.
|
|
|
|
3. Handle POSIX style regexps. I'm not sure if this could be called an
|
|
improvement; some of the things on regexps in the POSIX draft I have
|
|
seen are pretty sickening. But it would be useful in the interests of
|
|
conforming to the standard.
|
|
|
|
4. Replace the main driver program grep.c with the much cleaner main driver
|
|
program used in GNU fgrep.
|