Gnu e?grep 1.6
This commit is contained in:
commit
ecf0e2f6a5
339
gnu/usr.bin/grep/COPYING
Normal file
339
gnu/usr.bin/grep/COPYING
Normal file
@ -0,0 +1,339 @@
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
|
||||
675 Mass Ave, Cambridge, MA 02139, USA
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
Preamble
|
||||
|
||||
The licenses for most software are designed to take away your
|
||||
freedom to share and change it. By contrast, the GNU General Public
|
||||
License is intended to guarantee your freedom to share and change free
|
||||
software--to make sure the software is free for all its users. This
|
||||
General Public License applies to most of the Free Software
|
||||
Foundation's software and to any other program whose authors commit to
|
||||
using it. (Some other Free Software Foundation software is covered by
|
||||
the GNU Library General Public License instead.) You can apply it to
|
||||
your programs, too.
|
||||
|
||||
When we speak of free software, we are referring to freedom, not
|
||||
price. Our General Public Licenses are designed to make sure that you
|
||||
have the freedom to distribute copies of free software (and charge for
|
||||
this service if you wish), that you receive source code or can get it
|
||||
if you want it, that you can change the software or use pieces of it
|
||||
in new free programs; and that you know you can do these things.
|
||||
|
||||
To protect your rights, we need to make restrictions that forbid
|
||||
anyone to deny you these rights or to ask you to surrender the rights.
|
||||
These restrictions translate to certain responsibilities for you if you
|
||||
distribute copies of the software, or if you modify it.
|
||||
|
||||
For example, if you distribute copies of such a program, whether
|
||||
gratis or for a fee, you must give the recipients all the rights that
|
||||
you have. You must make sure that they, too, receive or can get the
|
||||
source code. And you must show them these terms so they know their
|
||||
rights.
|
||||
|
||||
We protect your rights with two steps: (1) copyright the software, and
|
||||
(2) offer you this license which gives you legal permission to copy,
|
||||
distribute and/or modify the software.
|
||||
|
||||
Also, for each author's protection and ours, we want to make certain
|
||||
that everyone understands that there is no warranty for this free
|
||||
software. If the software is modified by someone else and passed on, we
|
||||
want its recipients to know that what they have is not the original, so
|
||||
that any problems introduced by others will not reflect on the original
|
||||
authors' reputations.
|
||||
|
||||
Finally, any free program is threatened constantly by software
|
||||
patents. We wish to avoid the danger that redistributors of a free
|
||||
program will individually obtain patent licenses, in effect making the
|
||||
program proprietary. To prevent this, we have made it clear that any
|
||||
patent must be licensed for everyone's free use or not licensed at all.
|
||||
|
||||
The precise terms and conditions for copying, distribution and
|
||||
modification follow.
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. This License applies to any program or other work which contains
|
||||
a notice placed by the copyright holder saying it may be distributed
|
||||
under the terms of this General Public License. The "Program", below,
|
||||
refers to any such program or work, and a "work based on the Program"
|
||||
means either the Program or any derivative work under copyright law:
|
||||
that is to say, a work containing the Program or a portion of it,
|
||||
either verbatim or with modifications and/or translated into another
|
||||
language. (Hereinafter, translation is included without limitation in
|
||||
the term "modification".) Each licensee is addressed as "you".
|
||||
|
||||
Activities other than copying, distribution and modification are not
|
||||
covered by this License; they are outside its scope. The act of
|
||||
running the Program is not restricted, and the output from the Program
|
||||
is covered only if its contents constitute a work based on the
|
||||
Program (independent of having been made by running the Program).
|
||||
Whether that is true depends on what the Program does.
|
||||
|
||||
1. You may copy and distribute verbatim copies of the Program's
|
||||
source code as you receive it, in any medium, provided that you
|
||||
conspicuously and appropriately publish on each copy an appropriate
|
||||
copyright notice and disclaimer of warranty; keep intact all the
|
||||
notices that refer to this License and to the absence of any warranty;
|
||||
and give any other recipients of the Program a copy of this License
|
||||
along with the Program.
|
||||
|
||||
You may charge a fee for the physical act of transferring a copy, and
|
||||
you may at your option offer warranty protection in exchange for a fee.
|
||||
|
||||
2. You may modify your copy or copies of the Program or any portion
|
||||
of it, thus forming a work based on the Program, and copy and
|
||||
distribute such modifications or work under the terms of Section 1
|
||||
above, provided that you also meet all of these conditions:
|
||||
|
||||
a) You must cause the modified files to carry prominent notices
|
||||
stating that you changed the files and the date of any change.
|
||||
|
||||
b) You must cause any work that you distribute or publish, that in
|
||||
whole or in part contains or is derived from the Program or any
|
||||
part thereof, to be licensed as a whole at no charge to all third
|
||||
parties under the terms of this License.
|
||||
|
||||
c) If the modified program normally reads commands interactively
|
||||
when run, you must cause it, when started running for such
|
||||
interactive use in the most ordinary way, to print or display an
|
||||
announcement including an appropriate copyright notice and a
|
||||
notice that there is no warranty (or else, saying that you provide
|
||||
a warranty) and that users may redistribute the program under
|
||||
these conditions, and telling the user how to view a copy of this
|
||||
License. (Exception: if the Program itself is interactive but
|
||||
does not normally print such an announcement, your work based on
|
||||
the Program is not required to print an announcement.)
|
||||
|
||||
These requirements apply to the modified work as a whole. If
|
||||
identifiable sections of that work are not derived from the Program,
|
||||
and can be reasonably considered independent and separate works in
|
||||
themselves, then this License, and its terms, do not apply to those
|
||||
sections when you distribute them as separate works. But when you
|
||||
distribute the same sections as part of a whole which is a work based
|
||||
on the Program, the distribution of the whole must be on the terms of
|
||||
this License, whose permissions for other licensees extend to the
|
||||
entire whole, and thus to each and every part regardless of who wrote it.
|
||||
|
||||
Thus, it is not the intent of this section to claim rights or contest
|
||||
your rights to work written entirely by you; rather, the intent is to
|
||||
exercise the right to control the distribution of derivative or
|
||||
collective works based on the Program.
|
||||
|
||||
In addition, mere aggregation of another work not based on the Program
|
||||
with the Program (or with a work based on the Program) on a volume of
|
||||
a storage or distribution medium does not bring the other work under
|
||||
the scope of this License.
|
||||
|
||||
3. You may copy and distribute the Program (or a work based on it,
|
||||
under Section 2) in object code or executable form under the terms of
|
||||
Sections 1 and 2 above provided that you also do one of the following:
|
||||
|
||||
a) Accompany it with the complete corresponding machine-readable
|
||||
source code, which must be distributed under the terms of Sections
|
||||
1 and 2 above on a medium customarily used for software interchange; or,
|
||||
|
||||
b) Accompany it with a written offer, valid for at least three
|
||||
years, to give any third party, for a charge no more than your
|
||||
cost of physically performing source distribution, a complete
|
||||
machine-readable copy of the corresponding source code, to be
|
||||
distributed under the terms of Sections 1 and 2 above on a medium
|
||||
customarily used for software interchange; or,
|
||||
|
||||
c) Accompany it with the information you received as to the offer
|
||||
to distribute corresponding source code. (This alternative is
|
||||
allowed only for noncommercial distribution and only if you
|
||||
received the program in object code or executable form with such
|
||||
an offer, in accord with Subsection b above.)
|
||||
|
||||
The source code for a work means the preferred form of the work for
|
||||
making modifications to it. For an executable work, complete source
|
||||
code means all the source code for all modules it contains, plus any
|
||||
associated interface definition files, plus the scripts used to
|
||||
control compilation and installation of the executable. However, as a
|
||||
special exception, the source code distributed need not include
|
||||
anything that is normally distributed (in either source or binary
|
||||
form) with the major components (compiler, kernel, and so on) of the
|
||||
operating system on which the executable runs, unless that component
|
||||
itself accompanies the executable.
|
||||
|
||||
If distribution of executable or object code is made by offering
|
||||
access to copy from a designated place, then offering equivalent
|
||||
access to copy the source code from the same place counts as
|
||||
distribution of the source code, even though third parties are not
|
||||
compelled to copy the source along with the object code.
|
||||
|
||||
4. You may not copy, modify, sublicense, or distribute the Program
|
||||
except as expressly provided under this License. Any attempt
|
||||
otherwise to copy, modify, sublicense or distribute the Program is
|
||||
void, and will automatically terminate your rights under this License.
|
||||
However, parties who have received copies, or rights, from you under
|
||||
this License will not have their licenses terminated so long as such
|
||||
parties remain in full compliance.
|
||||
|
||||
5. You are not required to accept this License, since you have not
|
||||
signed it. However, nothing else grants you permission to modify or
|
||||
distribute the Program or its derivative works. These actions are
|
||||
prohibited by law if you do not accept this License. Therefore, by
|
||||
modifying or distributing the Program (or any work based on the
|
||||
Program), you indicate your acceptance of this License to do so, and
|
||||
all its terms and conditions for copying, distributing or modifying
|
||||
the Program or works based on it.
|
||||
|
||||
6. Each time you redistribute the Program (or any work based on the
|
||||
Program), the recipient automatically receives a license from the
|
||||
original licensor to copy, distribute or modify the Program subject to
|
||||
these terms and conditions. You may not impose any further
|
||||
restrictions on the recipients' exercise of the rights granted herein.
|
||||
You are not responsible for enforcing compliance by third parties to
|
||||
this License.
|
||||
|
||||
7. If, as a consequence of a court judgment or allegation of patent
|
||||
infringement or for any other reason (not limited to patent issues),
|
||||
conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot
|
||||
distribute so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you
|
||||
may not distribute the Program at all. For example, if a patent
|
||||
license would not permit royalty-free redistribution of the Program by
|
||||
all those who receive copies directly or indirectly through you, then
|
||||
the only way you could satisfy both it and this License would be to
|
||||
refrain entirely from distribution of the Program.
|
||||
|
||||
If any portion of this section is held invalid or unenforceable under
|
||||
any particular circumstance, the balance of the section is intended to
|
||||
apply and the section as a whole is intended to apply in other
|
||||
circumstances.
|
||||
|
||||
It is not the purpose of this section to induce you to infringe any
|
||||
patents or other property right claims or to contest validity of any
|
||||
such claims; this section has the sole purpose of protecting the
|
||||
integrity of the free software distribution system, which is
|
||||
implemented by public license practices. Many people have made
|
||||
generous contributions to the wide range of software distributed
|
||||
through that system in reliance on consistent application of that
|
||||
system; it is up to the author/donor to decide if he or she is willing
|
||||
to distribute software through any other system and a licensee cannot
|
||||
impose that choice.
|
||||
|
||||
This section is intended to make thoroughly clear what is believed to
|
||||
be a consequence of the rest of this License.
|
||||
|
||||
8. If the distribution and/or use of the Program is restricted in
|
||||
certain countries either by patents or by copyrighted interfaces, the
|
||||
original copyright holder who places the Program under this License
|
||||
may add an explicit geographical distribution limitation excluding
|
||||
those countries, so that distribution is permitted only in or among
|
||||
countries not thus excluded. In such case, this License incorporates
|
||||
the limitation as if written in the body of this License.
|
||||
|
||||
9. The Free Software Foundation may publish revised and/or new versions
|
||||
of the General Public License from time to time. Such new versions will
|
||||
be similar in spirit to the present version, but may differ in detail to
|
||||
address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the Program
|
||||
specifies a version number of this License which applies to it and "any
|
||||
later version", you have the option of following the terms and conditions
|
||||
either of that version or of any later version published by the Free
|
||||
Software Foundation. If the Program does not specify a version number of
|
||||
this License, you may choose any version ever published by the Free Software
|
||||
Foundation.
|
||||
|
||||
10. If you wish to incorporate parts of the Program into other free
|
||||
programs whose distribution conditions are different, write to the author
|
||||
to ask for permission. For software which is copyrighted by the Free
|
||||
Software Foundation, write to the Free Software Foundation; we sometimes
|
||||
make exceptions for this. Our decision will be guided by the two goals
|
||||
of preserving the free status of all derivatives of our free software and
|
||||
of promoting the sharing and reuse of software generally.
|
||||
|
||||
NO WARRANTY
|
||||
|
||||
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
|
||||
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
|
||||
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
|
||||
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
|
||||
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
|
||||
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
|
||||
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
|
||||
REPAIR OR CORRECTION.
|
||||
|
||||
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
|
||||
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
|
||||
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
|
||||
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
|
||||
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
|
||||
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
|
||||
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGES.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
Appendix: How to Apply These Terms to Your New Programs
|
||||
|
||||
If you develop a new program, and you want it to be of the greatest
|
||||
possible use to the public, the best way to achieve this is to make it
|
||||
free software which everyone can redistribute and change under these terms.
|
||||
|
||||
To do so, attach the following notices to the program. It is safest
|
||||
to attach them to the start of each source file to most effectively
|
||||
convey the exclusion of warranty; and each file should have at least
|
||||
the "copyright" line and a pointer to where the full notice is found.
|
||||
|
||||
<one line to give the program's name and a brief idea of what it does.>
|
||||
Copyright (C) 19yy <name of author>
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
|
||||
|
||||
Also add information on how to contact you by electronic and paper mail.
|
||||
|
||||
If the program is interactive, make it output a short notice like this
|
||||
when it starts in an interactive mode:
|
||||
|
||||
Gnomovision version 69, Copyright (C) 19yy name of author
|
||||
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
|
||||
This is free software, and you are welcome to redistribute it
|
||||
under certain conditions; type `show c' for details.
|
||||
|
||||
The hypothetical commands `show w' and `show c' should show the appropriate
|
||||
parts of the General Public License. Of course, the commands you use may
|
||||
be called something other than `show w' and `show c'; they could even be
|
||||
mouse-clicks or menu items--whatever suits your program.
|
||||
|
||||
You should also get your employer (if you work as a programmer) or your
|
||||
school, if any, to sign a "copyright disclaimer" for the program, if
|
||||
necessary. Here is a sample; alter the names:
|
||||
|
||||
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
|
||||
`Gnomovision' (which makes passes at compilers) written by James Hacker.
|
||||
|
||||
<signature of Ty Coon>, 1 April 1989
|
||||
Ty Coon, President of Vice
|
||||
|
||||
This General Public License does not permit incorporating your program into
|
||||
proprietary programs. If your program is a subroutine library, you may
|
||||
consider it more useful to permit linking proprietary applications with the
|
||||
library. If this is what you want to do, use the GNU Library General
|
||||
Public License instead of this License.
|
70
gnu/usr.bin/grep/README
Normal file
70
gnu/usr.bin/grep/README
Normal file
@ -0,0 +1,70 @@
|
||||
This README documents GNU e?grep version 1.6. All bugs reported for
|
||||
previous versions have been fixed.
|
||||
|
||||
See the file INSTALL for compilation and installation instructions.
|
||||
|
||||
Send bug reports to bug-gnu-utils@prep.ai.mit.edu.
|
||||
|
||||
GNU e?grep is provided "as is" with no warranty. The exact terms
|
||||
under which you may use and (re)distribute this program are detailed
|
||||
in the GNU General Public License, in the file COPYING.
|
||||
|
||||
GNU e?grep is based on a fast lazy-state deterministic matcher (about
|
||||
twice as fast as stock Unix egrep) hybridized with a Boyer-Moore-Gosper
|
||||
search for a fixed string that eliminates impossible text from being
|
||||
considered by the full regexp matcher without necessarily having to
|
||||
look at every character. The result is typically many times faster
|
||||
than Unix grep or egrep. (Regular expressions containing backreferencing
|
||||
may run more slowly, however.)
|
||||
|
||||
GNU e?grep is brought to you by the efforts of several people:
|
||||
|
||||
Mike Haertel wrote the deterministic regexp code and the bulk
|
||||
of the program.
|
||||
|
||||
James A. Woods is responsible for the hybridized search strategy
|
||||
of using Boyer-Moore-Gosper fixed-string search as a filter
|
||||
before calling the general regexp matcher.
|
||||
|
||||
Arthur David Olson contributed code that finds fixed strings for
|
||||
the aforementioned BMG search for a large class of regexps.
|
||||
|
||||
Richard Stallman wrote the backtracking regexp matcher that is
|
||||
used for \<digit> backreferences, as well as the getopt that
|
||||
is provided for 4.2BSD sites. The backtracking matcher was
|
||||
originally written for GNU Emacs.
|
||||
|
||||
D. A. Gwyn wrote the C alloca emulation that is provided so
|
||||
System V machines can run this program. (Alloca is used only
|
||||
by RMS' backtracking matcher, and then only rarely, so there
|
||||
is no loss if your machine doesn't have a "real" alloca.)
|
||||
|
||||
Scott Anderson and Henry Spencer designed the regression tests
|
||||
used in the "regress" script.
|
||||
|
||||
Paul Placeway wrote the manual page, based on this README.
|
||||
|
||||
If you are interested in improving this program, you may wish to try
|
||||
any of the following:
|
||||
|
||||
1. Replace the fast search loop with a faster search loop.
|
||||
There are several things that could be improved, the most notable
|
||||
of which would be to calculate a minimal delta2 to use.
|
||||
|
||||
2. Make backreferencing \<digit> faster. Right now, backreferencing is
|
||||
handled by calling the Emacs backtracking matcher to verify the partial
|
||||
match. This is slow; if the DFA routines could handle backreferencing
|
||||
themselves a speedup on the order of three to four times might occur
|
||||
in those cases where the backtracking matcher is called to verify nearly
|
||||
every line. Also, some portability problems due to the inclusion of the
|
||||
emacs matcher would be solved because it could then be eliminated.
|
||||
Note that expressions with backreferencing are not true regular
|
||||
expressions, and thus are not equivalent to any DFA. So this is hard.
|
||||
|
||||
3. Handle POSIX style regexps. I'm not sure if this could be called an
|
||||
improvement; some of the things on regexps in the POSIX draft I have
|
||||
seen are pretty sickening. But it would be useful in the interests of
|
||||
conforming to the standard.
|
||||
|
||||
4. Replace the main driver program grep.c with the much cleaner main driver
|
||||
program used in GNU fgrep.
|
2276
gnu/usr.bin/grep/dfa.c
Normal file
2276
gnu/usr.bin/grep/dfa.c
Normal file
File diff suppressed because it is too large
Load Diff
450
gnu/usr.bin/grep/dfa.h
Normal file
450
gnu/usr.bin/grep/dfa.h
Normal file
@ -0,0 +1,450 @@
|
||||
/* dfa.h - declarations for GNU deterministic regexp compiler
|
||||
Copyright (C) 1988 Free Software Foundation, Inc.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2, or (at your option)
|
||||
any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */
|
||||
|
||||
/* Written June, 1988 by Mike Haertel */
|
||||
|
||||
#ifdef STDC_HEADERS
|
||||
|
||||
#include <stddef.h>
|
||||
#include <stdlib.h>
|
||||
|
||||
#else /* !STDC_HEADERS */
|
||||
|
||||
#define const
|
||||
#include <sys/types.h> /* For size_t. */
|
||||
extern char *calloc(), *malloc(), *realloc();
|
||||
extern void free();
|
||||
|
||||
#ifndef NULL
|
||||
#define NULL 0
|
||||
#endif
|
||||
|
||||
#endif /* ! STDC_HEADERS */
|
||||
|
||||
#include <ctype.h>
|
||||
#ifndef isascii
|
||||
#define ISALNUM(c) isalnum(c)
|
||||
#define ISALPHA(c) isalpha(c)
|
||||
#define ISUPPER(c) isupper(c)
|
||||
#define ISLOWER(c) islower(c)
|
||||
#else
|
||||
#define ISALNUM(c) (isascii(c) && isalnum(c))
|
||||
#define ISALPHA(c) (isascii(c) && isalpha(c))
|
||||
#define ISUPPER(c) (isascii(c) && isupper(c))
|
||||
#define ISLOWER(c) (isascii(c) && islower(c))
|
||||
#endif
|
||||
|
||||
/* 1 means plain parentheses serve as grouping, and backslash
|
||||
parentheses are needed for literal searching.
|
||||
0 means backslash-parentheses are grouping, and plain parentheses
|
||||
are for literal searching. */
|
||||
#define RE_NO_BK_PARENS 1
|
||||
|
||||
/* 1 means plain | serves as the "or"-operator, and \| is a literal.
|
||||
0 means \| serves as the "or"-operator, and | is a literal. */
|
||||
#define RE_NO_BK_VBAR 2
|
||||
|
||||
/* 0 means plain + or ? serves as an operator, and \+, \? are literals.
|
||||
1 means \+, \? are operators and plain +, ? are literals. */
|
||||
#define RE_BK_PLUS_QM 4
|
||||
|
||||
/* 1 means | binds tighter than ^ or $.
|
||||
0 means the contrary. */
|
||||
#define RE_TIGHT_VBAR 8
|
||||
|
||||
/* 1 means treat \n as an _OR operator
|
||||
0 means treat it as a normal character */
|
||||
#define RE_NEWLINE_OR 16
|
||||
|
||||
/* 0 means that a special characters (such as *, ^, and $) always have
|
||||
their special meaning regardless of the surrounding context.
|
||||
1 means that special characters may act as normal characters in some
|
||||
contexts. Specifically, this applies to:
|
||||
^ - only special at the beginning, or after ( or |
|
||||
$ - only special at the end, or before ) or |
|
||||
*, +, ? - only special when not after the beginning, (, or | */
|
||||
#define RE_CONTEXT_INDEP_OPS 32
|
||||
|
||||
/* Now define combinations of bits for the standard possibilities. */
|
||||
#define RE_SYNTAX_AWK (RE_NO_BK_PARENS | RE_NO_BK_VBAR | RE_CONTEXT_INDEP_OPS)
|
||||
#define RE_SYNTAX_EGREP (RE_SYNTAX_AWK | RE_NEWLINE_OR)
|
||||
#define RE_SYNTAX_GREP (RE_BK_PLUS_QM | RE_NEWLINE_OR)
|
||||
#define RE_SYNTAX_EMACS 0
|
||||
|
||||
/* Number of bits in an unsigned char. */
|
||||
#define CHARBITS 8
|
||||
|
||||
/* First integer value that is greater than any character code. */
|
||||
#define _NOTCHAR (1 << CHARBITS)
|
||||
|
||||
/* INTBITS need not be exact, just a lower bound. */
|
||||
#define INTBITS (CHARBITS * sizeof (int))
|
||||
|
||||
/* Number of ints required to hold a bit for every character. */
|
||||
#define _CHARSET_INTS ((_NOTCHAR + INTBITS - 1) / INTBITS)
|
||||
|
||||
/* Sets of unsigned characters are stored as bit vectors in arrays of ints. */
|
||||
typedef int _charset[_CHARSET_INTS];
|
||||
|
||||
/* The regexp is parsed into an array of tokens in postfix form. Some tokens
|
||||
are operators and others are terminal symbols. Most (but not all) of these
|
||||
codes are returned by the lexical analyzer. */
|
||||
#if __STDC__
|
||||
|
||||
typedef enum
|
||||
{
|
||||
_END = -1, /* _END is a terminal symbol that matches the
|
||||
end of input; any value of _END or less in
|
||||
the parse tree is such a symbol. Accepting
|
||||
states of the DFA are those that would have
|
||||
a transition on _END. */
|
||||
|
||||
/* Ordinary character values are terminal symbols that match themselves. */
|
||||
|
||||
_EMPTY = _NOTCHAR, /* _EMPTY is a terminal symbol that matches
|
||||
the empty string. */
|
||||
|
||||
_BACKREF, /* _BACKREF is generated by \<digit>; it
|
||||
it not completely handled. If the scanner
|
||||
detects a transition on backref, it returns
|
||||
a kind of "semi-success" indicating that
|
||||
the match will have to be verified with
|
||||
a backtracking matcher. */
|
||||
|
||||
_BEGLINE, /* _BEGLINE is a terminal symbol that matches
|
||||
the empty string if it is at the beginning
|
||||
of a line. */
|
||||
|
||||
_ALLBEGLINE, /* _ALLBEGLINE is a terminal symbol that
|
||||
matches the empty string if it is at the
|
||||
beginning of a line; _ALLBEGLINE applies
|
||||
to the entire regexp and can only occur
|
||||
as the first token thereof. _ALLBEGLINE
|
||||
never appears in the parse tree; a _BEGLINE
|
||||
is prepended with _CAT to the entire
|
||||
regexp instead. */
|
||||
|
||||
_ENDLINE, /* _ENDLINE is a terminal symbol that matches
|
||||
the empty string if it is at the end of
|
||||
a line. */
|
||||
|
||||
_ALLENDLINE, /* _ALLENDLINE is to _ENDLINE as _ALLBEGLINE
|
||||
is to _BEGLINE. */
|
||||
|
||||
_BEGWORD, /* _BEGWORD is a terminal symbol that matches
|
||||
the empty string if it is at the beginning
|
||||
of a word. */
|
||||
|
||||
_ENDWORD, /* _ENDWORD is a terminal symbol that matches
|
||||
the empty string if it is at the end of
|
||||
a word. */
|
||||
|
||||
_LIMWORD, /* _LIMWORD is a terminal symbol that matches
|
||||
the empty string if it is at the beginning
|
||||
or the end of a word. */
|
||||
|
||||
_NOTLIMWORD, /* _NOTLIMWORD is a terminal symbol that
|
||||
matches the empty string if it is not at
|
||||
the beginning or end of a word. */
|
||||
|
||||
_QMARK, /* _QMARK is an operator of one argument that
|
||||
matches zero or one occurences of its
|
||||
argument. */
|
||||
|
||||
_STAR, /* _STAR is an operator of one argument that
|
||||
matches the Kleene closure (zero or more
|
||||
occurrences) of its argument. */
|
||||
|
||||
_PLUS, /* _PLUS is an operator of one argument that
|
||||
matches the positive closure (one or more
|
||||
occurrences) of its argument. */
|
||||
|
||||
_CAT, /* _CAT is an operator of two arguments that
|
||||
matches the concatenation of its
|
||||
arguments. _CAT is never returned by the
|
||||
lexical analyzer. */
|
||||
|
||||
_OR, /* _OR is an operator of two arguments that
|
||||
matches either of its arguments. */
|
||||
|
||||
_LPAREN, /* _LPAREN never appears in the parse tree,
|
||||
it is only a lexeme. */
|
||||
|
||||
_RPAREN, /* _RPAREN never appears in the parse tree. */
|
||||
|
||||
_SET /* _SET and (and any value greater) is a
|
||||
terminal symbol that matches any of a
|
||||
class of characters. */
|
||||
} _token;
|
||||
|
||||
#else /* ! __STDC__ */
|
||||
|
||||
typedef short _token;
|
||||
|
||||
#define _END -1
|
||||
#define _EMPTY _NOTCHAR
|
||||
#define _BACKREF (_EMPTY + 1)
|
||||
#define _BEGLINE (_EMPTY + 2)
|
||||
#define _ALLBEGLINE (_EMPTY + 3)
|
||||
#define _ENDLINE (_EMPTY + 4)
|
||||
#define _ALLENDLINE (_EMPTY + 5)
|
||||
#define _BEGWORD (_EMPTY + 6)
|
||||
#define _ENDWORD (_EMPTY + 7)
|
||||
#define _LIMWORD (_EMPTY + 8)
|
||||
#define _NOTLIMWORD (_EMPTY + 9)
|
||||
#define _QMARK (_EMPTY + 10)
|
||||
#define _STAR (_EMPTY + 11)
|
||||
#define _PLUS (_EMPTY + 12)
|
||||
#define _CAT (_EMPTY + 13)
|
||||
#define _OR (_EMPTY + 14)
|
||||
#define _LPAREN (_EMPTY + 15)
|
||||
#define _RPAREN (_EMPTY + 16)
|
||||
#define _SET (_EMPTY + 17)
|
||||
|
||||
#endif /* ! __STDC__ */
|
||||
|
||||
/* Sets are stored in an array in the compiled regexp; the index of the
|
||||
array corresponding to a given set token is given by _SET_INDEX(t). */
|
||||
#define _SET_INDEX(t) ((t) - _SET)
|
||||
|
||||
/* Sometimes characters can only be matched depending on the surrounding
|
||||
context. Such context decisions depend on what the previous character
|
||||
was, and the value of the current (lookahead) character. Context
|
||||
dependent constraints are encoded as 8 bit integers. Each bit that
|
||||
is set indicates that the constraint succeeds in the corresponding
|
||||
context.
|
||||
|
||||
bit 7 - previous and current are newlines
|
||||
bit 6 - previous was newline, current isn't
|
||||
bit 5 - previous wasn't newline, current is
|
||||
bit 4 - neither previous nor current is a newline
|
||||
bit 3 - previous and current are word-constituents
|
||||
bit 2 - previous was word-constituent, current isn't
|
||||
bit 1 - previous wasn't word-constituent, current is
|
||||
bit 0 - neither previous nor current is word-constituent
|
||||
|
||||
Word-constituent characters are those that satisfy isalnum().
|
||||
|
||||
The macro _SUCCEEDS_IN_CONTEXT determines whether a a given constraint
|
||||
succeeds in a particular context. Prevn is true if the previous character
|
||||
was a newline, currn is true if the lookahead character is a newline.
|
||||
Prevl and currl similarly depend upon whether the previous and current
|
||||
characters are word-constituent letters. */
|
||||
#define _MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn) \
|
||||
((constraint) & 1 << ((prevn) ? 2 : 0) + ((currn) ? 1 : 0) + 4)
|
||||
#define _MATCHES_LETTER_CONTEXT(constraint, prevl, currl) \
|
||||
((constraint) & 1 << ((prevl) ? 2 : 0) + ((currl) ? 1 : 0))
|
||||
#define _SUCCEEDS_IN_CONTEXT(constraint, prevn, currn, prevl, currl) \
|
||||
(_MATCHES_NEWLINE_CONTEXT(constraint, prevn, currn) \
|
||||
&& _MATCHES_LETTER_CONTEXT(constraint, prevl, currl))
|
||||
|
||||
/* The following macros give information about what a constraint depends on. */
|
||||
#define _PREV_NEWLINE_DEPENDENT(constraint) \
|
||||
(((constraint) & 0xc0) >> 2 != ((constraint) & 0x30))
|
||||
#define _PREV_LETTER_DEPENDENT(constraint) \
|
||||
(((constraint) & 0x0c) >> 2 != ((constraint) & 0x03))
|
||||
|
||||
/* Tokens that match the empty string subject to some constraint actually
|
||||
work by applying that constraint to determine what may follow them,
|
||||
taking into account what has gone before. The following values are
|
||||
the constraints corresponding to the special tokens previously defined. */
|
||||
#define _NO_CONSTRAINT 0xff
|
||||
#define _BEGLINE_CONSTRAINT 0xcf
|
||||
#define _ENDLINE_CONSTRAINT 0xaf
|
||||
#define _BEGWORD_CONSTRAINT 0xf2
|
||||
#define _ENDWORD_CONSTRAINT 0xf4
|
||||
#define _LIMWORD_CONSTRAINT 0xf6
|
||||
#define _NOTLIMWORD_CONSTRAINT 0xf9
|
||||
|
||||
/* States of the recognizer correspond to sets of positions in the parse
|
||||
tree, together with the constraints under which they may be matched.
|
||||
So a position is encoded as an index into the parse tree together with
|
||||
a constraint. */
|
||||
typedef struct
|
||||
{
|
||||
unsigned index; /* Index into the parse array. */
|
||||
unsigned constraint; /* Constraint for matching this position. */
|
||||
} _position;
|
||||
|
||||
/* Sets of positions are stored as arrays. */
|
||||
typedef struct
|
||||
{
|
||||
_position *elems; /* Elements of this position set. */
|
||||
int nelem; /* Number of elements in this set. */
|
||||
} _position_set;
|
||||
|
||||
/* A state of the regexp consists of a set of positions, some flags,
|
||||
and the token value of the lowest-numbered position of the state that
|
||||
contains an _END token. */
|
||||
typedef struct
|
||||
{
|
||||
int hash; /* Hash of the positions of this state. */
|
||||
_position_set elems; /* Positions this state could match. */
|
||||
char newline; /* True if previous state matched newline. */
|
||||
char letter; /* True if previous state matched a letter. */
|
||||
char backref; /* True if this state matches a \<digit>. */
|
||||
unsigned char constraint; /* Constraint for this state to accept. */
|
||||
int first_end; /* Token value of the first _END in elems. */
|
||||
} _dfa_state;
|
||||
|
||||
/* If an r.e. is at most MUST_MAX characters long, we look for a string which
|
||||
must appear in it; whatever's found is dropped into the struct reg. */
|
||||
|
||||
#define MUST_MAX 50
|
||||
|
||||
/* A compiled regular expression. */
|
||||
struct regexp
|
||||
{
|
||||
/* Stuff built by the scanner. */
|
||||
_charset *charsets; /* Array of character sets for _SET tokens. */
|
||||
int cindex; /* Index for adding new charsets. */
|
||||
int calloc; /* Number of charsets currently allocated. */
|
||||
|
||||
/* Stuff built by the parser. */
|
||||
_token *tokens; /* Postfix parse array. */
|
||||
int tindex; /* Index for adding new tokens. */
|
||||
int talloc; /* Number of tokens currently allocated. */
|
||||
int depth; /* Depth required of an evaluation stack
|
||||
used for depth-first traversal of the
|
||||
parse tree. */
|
||||
int nleaves; /* Number of leaves on the parse tree. */
|
||||
int nregexps; /* Count of parallel regexps being built
|
||||
with regparse(). */
|
||||
|
||||
/* Stuff owned by the state builder. */
|
||||
_dfa_state *states; /* States of the regexp. */
|
||||
int sindex; /* Index for adding new states. */
|
||||
int salloc; /* Number of states currently allocated. */
|
||||
|
||||
/* Stuff built by the structure analyzer. */
|
||||
_position_set *follows; /* Array of follow sets, indexed by position
|
||||
index. The follow of a position is the set
|
||||
of positions containing characters that
|
||||
could conceivably follow a character
|
||||
matching the given position in a string
|
||||
matching the regexp. Allocated to the
|
||||
maximum possible position index. */
|
||||
int searchflag; /* True if we are supposed to build a searching
|
||||
as opposed to an exact matcher. A searching
|
||||
matcher finds the first and shortest string
|
||||
matching a regexp anywhere in the buffer,
|
||||
whereas an exact matcher finds the longest
|
||||
string matching, but anchored to the
|
||||
beginning of the buffer. */
|
||||
|
||||
/* Stuff owned by the executor. */
|
||||
int tralloc; /* Number of transition tables that have
|
||||
slots so far. */
|
||||
int trcount; /* Number of transition tables that have
|
||||
actually been built. */
|
||||
int **trans; /* Transition tables for states that can
|
||||
never accept. If the transitions for a
|
||||
state have not yet been computed, or the
|
||||
state could possibly accept, its entry in
|
||||
this table is NULL. */
|
||||
int **realtrans; /* Trans always points to realtrans + 1; this
|
||||
is so trans[-1] can contain NULL. */
|
||||
int **fails; /* Transition tables after failing to accept
|
||||
on a state that potentially could do so. */
|
||||
int *success; /* Table of acceptance conditions used in
|
||||
regexecute and computed in build_state. */
|
||||
int *newlines; /* Transitions on newlines. The entry for a
|
||||
newline in any transition table is always
|
||||
-1 so we can count lines without wasting
|
||||
too many cycles. The transition for a
|
||||
newline is stored separately and handled
|
||||
as a special case. Newline is also used
|
||||
as a sentinel at the end of the buffer. */
|
||||
char must[MUST_MAX];
|
||||
int mustn;
|
||||
};
|
||||
|
||||
/* Some macros for user access to regexp internals. */
|
||||
|
||||
/* ACCEPTING returns true if s could possibly be an accepting state of r. */
|
||||
#define ACCEPTING(s, r) ((r).states[s].constraint)
|
||||
|
||||
/* ACCEPTS_IN_CONTEXT returns true if the given state accepts in the
|
||||
specified context. */
|
||||
#define ACCEPTS_IN_CONTEXT(prevn, currn, prevl, currl, state, reg) \
|
||||
_SUCCEEDS_IN_CONTEXT((reg).states[state].constraint, \
|
||||
prevn, currn, prevl, currl)
|
||||
|
||||
/* FIRST_MATCHING_REGEXP returns the index number of the first of parallel
|
||||
regexps that a given state could accept. Parallel regexps are numbered
|
||||
starting at 1. */
|
||||
#define FIRST_MATCHING_REGEXP(state, reg) (-(reg).states[state].first_end)
|
||||
|
||||
/* Entry points. */
|
||||
|
||||
#if __STDC__
|
||||
|
||||
/* Regsyntax() takes two arguments; the first sets the syntax bits described
|
||||
earlier in this file, and the second sets the case-folding flag. */
|
||||
extern void regsyntax(int, int);
|
||||
|
||||
/* Compile the given string of the given length into the given struct regexp.
|
||||
Final argument is a flag specifying whether to build a searching or an
|
||||
exact matcher. */
|
||||
extern void regcompile(const char *, size_t, struct regexp *, int);
|
||||
|
||||
/* Execute the given struct regexp on the buffer of characters. The
|
||||
first char * points to the beginning, and the second points to the
|
||||
first character after the end of the buffer, which must be a writable
|
||||
place so a sentinel end-of-buffer marker can be stored there. The
|
||||
second-to-last argument is a flag telling whether to allow newlines to
|
||||
be part of a string matching the regexp. The next-to-last argument,
|
||||
if non-NULL, points to a place to increment every time we see a
|
||||
newline. The final argument, if non-NULL, points to a flag that will
|
||||
be set if further examination by a backtracking matcher is needed in
|
||||
order to verify backreferencing; otherwise the flag will be cleared.
|
||||
Returns NULL if no match is found, or a pointer to the first
|
||||
character after the first & shortest matching string in the buffer. */
|
||||
extern char *regexecute(struct regexp *, char *, char *, int, int *, int *);
|
||||
|
||||
/* Free the storage held by the components of a struct regexp. */
|
||||
extern void regfree(struct regexp *);
|
||||
|
||||
/* Entry points for people who know what they're doing. */
|
||||
|
||||
/* Initialize the components of a struct regexp. */
|
||||
extern void reginit(struct regexp *);
|
||||
|
||||
/* Incrementally parse a string of given length into a struct regexp. */
|
||||
extern void regparse(const char *, size_t, struct regexp *);
|
||||
|
||||
/* Analyze a parsed regexp; second argument tells whether to build a searching
|
||||
or an exact matcher. */
|
||||
extern void reganalyze(struct regexp *, int);
|
||||
|
||||
/* Compute, for each possible character, the transitions out of a given
|
||||
state, storing them in an array of integers. */
|
||||
extern void regstate(int, struct regexp *, int []);
|
||||
|
||||
/* Error handling. */
|
||||
|
||||
/* Regerror() is called by the regexp routines whenever an error occurs. It
|
||||
takes a single argument, a NUL-terminated string describing the error.
|
||||
The default regerror() prints the error message to stderr and exits.
|
||||
The user can provide a different regfree() if so desired. */
|
||||
extern void regerror(const char *);
|
||||
|
||||
#else /* ! __STDC__ */
|
||||
extern void regsyntax(), regcompile(), regfree(), reginit(), regparse();
|
||||
extern void reganalyze(), regstate(), regerror();
|
||||
extern char *regexecute();
|
||||
#endif /* ! __STDC__ */
|
678
gnu/usr.bin/grep/getopt.c
Normal file
678
gnu/usr.bin/grep/getopt.c
Normal file
@ -0,0 +1,678 @@
|
||||
/* Getopt for GNU.
|
||||
NOTE: getopt is now part of the C library, so if you don't know what
|
||||
"Keep this file name-space clean" means, talk to roland@gnu.ai.mit.edu
|
||||
before changing it!
|
||||
|
||||
Copyright (C) 1987, 88, 89, 90, 91, 1992 Free Software Foundation, Inc.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2, or (at your option)
|
||||
any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */
|
||||
|
||||
/* AIX requires this to be the first thing in the file. */
|
||||
#ifdef __GNUC__
|
||||
#define alloca __builtin_alloca
|
||||
#else /* not __GNUC__ */
|
||||
#if defined(sparc) && !defined(USG) && !defined(SVR4) && !defined(__svr4__)
|
||||
#include <alloca.h>
|
||||
#else
|
||||
#ifdef _AIX
|
||||
#pragma alloca
|
||||
#else
|
||||
char *alloca ();
|
||||
#endif
|
||||
#endif /* sparc */
|
||||
#endif /* not __GNUC__ */
|
||||
|
||||
#ifdef LIBC
|
||||
/* For when compiled as part of the GNU C library. */
|
||||
#include <ansidecl.h>
|
||||
#endif
|
||||
|
||||
#include <stdio.h>
|
||||
|
||||
/* This needs to come after some library #include
|
||||
to get __GNU_LIBRARY__ defined. */
|
||||
#ifdef __GNU_LIBRARY__
|
||||
#undef alloca
|
||||
#include <stdlib.h>
|
||||
#include <string.h>
|
||||
#else /* Not GNU C library. */
|
||||
#define __alloca alloca
|
||||
#endif /* GNU C library. */
|
||||
|
||||
|
||||
#ifndef __STDC__
|
||||
#define const
|
||||
#endif
|
||||
|
||||
/* If GETOPT_COMPAT is defined, `+' as well as `--' can introduce a
|
||||
long-named option. Because this is not POSIX.2 compliant, it is
|
||||
being phased out. */
|
||||
#define GETOPT_COMPAT
|
||||
|
||||
/* This version of `getopt' appears to the caller like standard Unix `getopt'
|
||||
but it behaves differently for the user, since it allows the user
|
||||
to intersperse the options with the other arguments.
|
||||
|
||||
As `getopt' works, it permutes the elements of ARGV so that,
|
||||
when it is done, all the options precede everything else. Thus
|
||||
all application programs are extended to handle flexible argument order.
|
||||
|
||||
Setting the environment variable POSIXLY_CORRECT disables permutation.
|
||||
Then the behavior is completely standard.
|
||||
|
||||
GNU application programs can use a third alternative mode in which
|
||||
they can distinguish the relative order of options and other arguments. */
|
||||
|
||||
#include "getopt.h"
|
||||
|
||||
/* For communication from `getopt' to the caller.
|
||||
When `getopt' finds an option that takes an argument,
|
||||
the argument value is returned here.
|
||||
Also, when `ordering' is RETURN_IN_ORDER,
|
||||
each non-option ARGV-element is returned here. */
|
||||
|
||||
char *optarg = 0;
|
||||
|
||||
/* Index in ARGV of the next element to be scanned.
|
||||
This is used for communication to and from the caller
|
||||
and for communication between successive calls to `getopt'.
|
||||
|
||||
On entry to `getopt', zero means this is the first call; initialize.
|
||||
|
||||
When `getopt' returns EOF, this is the index of the first of the
|
||||
non-option elements that the caller should itself scan.
|
||||
|
||||
Otherwise, `optind' communicates from one call to the next
|
||||
how much of ARGV has been scanned so far. */
|
||||
|
||||
int optind = 0;
|
||||
|
||||
/* The next char to be scanned in the option-element
|
||||
in which the last option character we returned was found.
|
||||
This allows us to pick up the scan where we left off.
|
||||
|
||||
If this is zero, or a null string, it means resume the scan
|
||||
by advancing to the next ARGV-element. */
|
||||
|
||||
static char *nextchar;
|
||||
|
||||
/* Callers store zero here to inhibit the error message
|
||||
for unrecognized options. */
|
||||
|
||||
int opterr = 1;
|
||||
|
||||
/* Describe how to deal with options that follow non-option ARGV-elements.
|
||||
|
||||
If the caller did not specify anything,
|
||||
the default is REQUIRE_ORDER if the environment variable
|
||||
POSIXLY_CORRECT is defined, PERMUTE otherwise.
|
||||
|
||||
REQUIRE_ORDER means don't recognize them as options;
|
||||
stop option processing when the first non-option is seen.
|
||||
This is what Unix does.
|
||||
This mode of operation is selected by either setting the environment
|
||||
variable POSIXLY_CORRECT, or using `+' as the first character
|
||||
of the list of option characters.
|
||||
|
||||
PERMUTE is the default. We permute the contents of ARGV as we scan,
|
||||
so that eventually all the non-options are at the end. This allows options
|
||||
to be given in any order, even with programs that were not written to
|
||||
expect this.
|
||||
|
||||
RETURN_IN_ORDER is an option available to programs that were written
|
||||
to expect options and other ARGV-elements in any order and that care about
|
||||
the ordering of the two. We describe each non-option ARGV-element
|
||||
as if it were the argument of an option with character code 1.
|
||||
Using `-' as the first character of the list of option characters
|
||||
selects this mode of operation.
|
||||
|
||||
The special argument `--' forces an end of option-scanning regardless
|
||||
of the value of `ordering'. In the case of RETURN_IN_ORDER, only
|
||||
`--' can cause `getopt' to return EOF with `optind' != ARGC. */
|
||||
|
||||
static enum
|
||||
{
|
||||
REQUIRE_ORDER, PERMUTE, RETURN_IN_ORDER
|
||||
} ordering;
|
||||
|
||||
#ifdef __GNU_LIBRARY__
|
||||
#include <string.h>
|
||||
#define my_index strchr
|
||||
#define my_bcopy(src, dst, n) memcpy ((dst), (src), (n))
|
||||
#else
|
||||
|
||||
/* Avoid depending on library functions or files
|
||||
whose names are inconsistent. */
|
||||
|
||||
char *getenv ();
|
||||
|
||||
static char *
|
||||
my_index (string, chr)
|
||||
char *string;
|
||||
int chr;
|
||||
{
|
||||
while (*string)
|
||||
{
|
||||
if (*string == chr)
|
||||
return string;
|
||||
string++;
|
||||
}
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void
|
||||
my_bcopy (from, to, size)
|
||||
char *from, *to;
|
||||
int size;
|
||||
{
|
||||
int i;
|
||||
for (i = 0; i < size; i++)
|
||||
to[i] = from[i];
|
||||
}
|
||||
#endif /* GNU C library. */
|
||||
|
||||
/* Handle permutation of arguments. */
|
||||
|
||||
/* Describe the part of ARGV that contains non-options that have
|
||||
been skipped. `first_nonopt' is the index in ARGV of the first of them;
|
||||
`last_nonopt' is the index after the last of them. */
|
||||
|
||||
static int first_nonopt;
|
||||
static int last_nonopt;
|
||||
|
||||
/* Exchange two adjacent subsequences of ARGV.
|
||||
One subsequence is elements [first_nonopt,last_nonopt)
|
||||
which contains all the non-options that have been skipped so far.
|
||||
The other is elements [last_nonopt,optind), which contains all
|
||||
the options processed since those non-options were skipped.
|
||||
|
||||
`first_nonopt' and `last_nonopt' are relocated so that they describe
|
||||
the new indices of the non-options in ARGV after they are moved. */
|
||||
|
||||
static void
|
||||
exchange (argv)
|
||||
char **argv;
|
||||
{
|
||||
int nonopts_size = (last_nonopt - first_nonopt) * sizeof (char *);
|
||||
char **temp = (char **) __alloca (nonopts_size);
|
||||
|
||||
/* Interchange the two blocks of data in ARGV. */
|
||||
|
||||
my_bcopy (&argv[first_nonopt], temp, nonopts_size);
|
||||
my_bcopy (&argv[last_nonopt], &argv[first_nonopt],
|
||||
(optind - last_nonopt) * sizeof (char *));
|
||||
my_bcopy (temp, &argv[first_nonopt + optind - last_nonopt], nonopts_size);
|
||||
|
||||
/* Update records for the slots the non-options now occupy. */
|
||||
|
||||
first_nonopt += (optind - last_nonopt);
|
||||
last_nonopt = optind;
|
||||
}
|
||||
|
||||
/* Scan elements of ARGV (whose length is ARGC) for option characters
|
||||
given in OPTSTRING.
|
||||
|
||||
If an element of ARGV starts with '-', and is not exactly "-" or "--",
|
||||
then it is an option element. The characters of this element
|
||||
(aside from the initial '-') are option characters. If `getopt'
|
||||
is called repeatedly, it returns successively each of the option characters
|
||||
from each of the option elements.
|
||||
|
||||
If `getopt' finds another option character, it returns that character,
|
||||
updating `optind' and `nextchar' so that the next call to `getopt' can
|
||||
resume the scan with the following option character or ARGV-element.
|
||||
|
||||
If there are no more option characters, `getopt' returns `EOF'.
|
||||
Then `optind' is the index in ARGV of the first ARGV-element
|
||||
that is not an option. (The ARGV-elements have been permuted
|
||||
so that those that are not options now come last.)
|
||||
|
||||
OPTSTRING is a string containing the legitimate option characters.
|
||||
If an option character is seen that is not listed in OPTSTRING,
|
||||
return '?' after printing an error message. If you set `opterr' to
|
||||
zero, the error message is suppressed but we still return '?'.
|
||||
|
||||
If a char in OPTSTRING is followed by a colon, that means it wants an arg,
|
||||
so the following text in the same ARGV-element, or the text of the following
|
||||
ARGV-element, is returned in `optarg'. Two colons mean an option that
|
||||
wants an optional arg; if there is text in the current ARGV-element,
|
||||
it is returned in `optarg', otherwise `optarg' is set to zero.
|
||||
|
||||
If OPTSTRING starts with `-' or `+', it requests different methods of
|
||||
handling the non-option ARGV-elements.
|
||||
See the comments about RETURN_IN_ORDER and REQUIRE_ORDER, above.
|
||||
|
||||
Long-named options begin with `--' instead of `-'.
|
||||
Their names may be abbreviated as long as the abbreviation is unique
|
||||
or is an exact match for some defined option. If they have an
|
||||
argument, it follows the option name in the same ARGV-element, separated
|
||||
from the option name by a `=', or else the in next ARGV-element.
|
||||
When `getopt' finds a long-named option, it returns 0 if that option's
|
||||
`flag' field is nonzero, the value of the option's `val' field
|
||||
if the `flag' field is zero.
|
||||
|
||||
The elements of ARGV aren't really const, because we permute them.
|
||||
But we pretend they're const in the prototype to be compatible
|
||||
with other systems.
|
||||
|
||||
LONGOPTS is a vector of `struct option' terminated by an
|
||||
element containing a name which is zero.
|
||||
|
||||
LONGIND returns the index in LONGOPT of the long-named option found.
|
||||
It is only valid when a long-named option has been found by the most
|
||||
recent call.
|
||||
|
||||
If LONG_ONLY is nonzero, '-' as well as '--' can introduce
|
||||
long-named options. */
|
||||
|
||||
int
|
||||
_getopt_internal (argc, argv, optstring, longopts, longind, long_only)
|
||||
int argc;
|
||||
char *const *argv;
|
||||
const char *optstring;
|
||||
const struct option *longopts;
|
||||
int *longind;
|
||||
int long_only;
|
||||
{
|
||||
int option_index;
|
||||
|
||||
optarg = 0;
|
||||
|
||||
/* Initialize the internal data when the first call is made.
|
||||
Start processing options with ARGV-element 1 (since ARGV-element 0
|
||||
is the program name); the sequence of previously skipped
|
||||
non-option ARGV-elements is empty. */
|
||||
|
||||
if (optind == 0)
|
||||
{
|
||||
first_nonopt = last_nonopt = optind = 1;
|
||||
|
||||
nextchar = NULL;
|
||||
|
||||
/* Determine how to handle the ordering of options and nonoptions. */
|
||||
|
||||
if (optstring[0] == '-')
|
||||
{
|
||||
ordering = RETURN_IN_ORDER;
|
||||
++optstring;
|
||||
}
|
||||
else if (optstring[0] == '+')
|
||||
{
|
||||
ordering = REQUIRE_ORDER;
|
||||
++optstring;
|
||||
}
|
||||
else if (getenv ("POSIXLY_CORRECT") != NULL)
|
||||
ordering = REQUIRE_ORDER;
|
||||
else
|
||||
ordering = PERMUTE;
|
||||
}
|
||||
|
||||
if (nextchar == NULL || *nextchar == '\0')
|
||||
{
|
||||
if (ordering == PERMUTE)
|
||||
{
|
||||
/* If we have just processed some options following some non-options,
|
||||
exchange them so that the options come first. */
|
||||
|
||||
if (first_nonopt != last_nonopt && last_nonopt != optind)
|
||||
exchange ((char **) argv);
|
||||
else if (last_nonopt != optind)
|
||||
first_nonopt = optind;
|
||||
|
||||
/* Now skip any additional non-options
|
||||
and extend the range of non-options previously skipped. */
|
||||
|
||||
while (optind < argc
|
||||
&& (argv[optind][0] != '-' || argv[optind][1] == '\0')
|
||||
#ifdef GETOPT_COMPAT
|
||||
&& (longopts == NULL
|
||||
|| argv[optind][0] != '+' || argv[optind][1] == '\0')
|
||||
#endif /* GETOPT_COMPAT */
|
||||
)
|
||||
optind++;
|
||||
last_nonopt = optind;
|
||||
}
|
||||
|
||||
/* Special ARGV-element `--' means premature end of options.
|
||||
Skip it like a null option,
|
||||
then exchange with previous non-options as if it were an option,
|
||||
then skip everything else like a non-option. */
|
||||
|
||||
if (optind != argc && !strcmp (argv[optind], "--"))
|
||||
{
|
||||
optind++;
|
||||
|
||||
if (first_nonopt != last_nonopt && last_nonopt != optind)
|
||||
exchange ((char **) argv);
|
||||
else if (first_nonopt == last_nonopt)
|
||||
first_nonopt = optind;
|
||||
last_nonopt = argc;
|
||||
|
||||
optind = argc;
|
||||
}
|
||||
|
||||
/* If we have done all the ARGV-elements, stop the scan
|
||||
and back over any non-options that we skipped and permuted. */
|
||||
|
||||
if (optind == argc)
|
||||
{
|
||||
/* Set the next-arg-index to point at the non-options
|
||||
that we previously skipped, so the caller will digest them. */
|
||||
if (first_nonopt != last_nonopt)
|
||||
optind = first_nonopt;
|
||||
return EOF;
|
||||
}
|
||||
|
||||
/* If we have come to a non-option and did not permute it,
|
||||
either stop the scan or describe it to the caller and pass it by. */
|
||||
|
||||
if ((argv[optind][0] != '-' || argv[optind][1] == '\0')
|
||||
#ifdef GETOPT_COMPAT
|
||||
&& (longopts == NULL
|
||||
|| argv[optind][0] != '+' || argv[optind][1] == '\0')
|
||||
#endif /* GETOPT_COMPAT */
|
||||
)
|
||||
{
|
||||
if (ordering == REQUIRE_ORDER)
|
||||
return EOF;
|
||||
optarg = argv[optind++];
|
||||
return 1;
|
||||
}
|
||||
|
||||
/* We have found another option-ARGV-element.
|
||||
Start decoding its characters. */
|
||||
|
||||
nextchar = (argv[optind] + 1
|
||||
+ (longopts != NULL && argv[optind][1] == '-'));
|
||||
}
|
||||
|
||||
if (longopts != NULL
|
||||
&& ((argv[optind][0] == '-'
|
||||
&& (argv[optind][1] == '-' || long_only))
|
||||
#ifdef GETOPT_COMPAT
|
||||
|| argv[optind][0] == '+'
|
||||
#endif /* GETOPT_COMPAT */
|
||||
))
|
||||
{
|
||||
const struct option *p;
|
||||
char *s = nextchar;
|
||||
int exact = 0;
|
||||
int ambig = 0;
|
||||
const struct option *pfound = NULL;
|
||||
int indfound;
|
||||
|
||||
while (*s && *s != '=')
|
||||
s++;
|
||||
|
||||
/* Test all options for either exact match or abbreviated matches. */
|
||||
for (p = longopts, option_index = 0; p->name;
|
||||
p++, option_index++)
|
||||
if (!strncmp (p->name, nextchar, s - nextchar))
|
||||
{
|
||||
if (s - nextchar == strlen (p->name))
|
||||
{
|
||||
/* Exact match found. */
|
||||
pfound = p;
|
||||
indfound = option_index;
|
||||
exact = 1;
|
||||
break;
|
||||
}
|
||||
else if (pfound == NULL)
|
||||
{
|
||||
/* First nonexact match found. */
|
||||
pfound = p;
|
||||
indfound = option_index;
|
||||
}
|
||||
else
|
||||
/* Second nonexact match found. */
|
||||
ambig = 1;
|
||||
}
|
||||
|
||||
if (ambig && !exact)
|
||||
{
|
||||
if (opterr)
|
||||
fprintf (stderr, "%s: option `%s' is ambiguous\n",
|
||||
argv[0], argv[optind]);
|
||||
nextchar += strlen (nextchar);
|
||||
optind++;
|
||||
return '?';
|
||||
}
|
||||
|
||||
if (pfound != NULL)
|
||||
{
|
||||
option_index = indfound;
|
||||
optind++;
|
||||
if (*s)
|
||||
{
|
||||
/* Don't test has_arg with >, because some C compilers don't
|
||||
allow it to be used on enums. */
|
||||
if (pfound->has_arg)
|
||||
optarg = s + 1;
|
||||
else
|
||||
{
|
||||
if (opterr)
|
||||
{
|
||||
if (argv[optind - 1][1] == '-')
|
||||
/* --option */
|
||||
fprintf (stderr,
|
||||
"%s: option `--%s' doesn't allow an argument\n",
|
||||
argv[0], pfound->name);
|
||||
else
|
||||
/* +option or -option */
|
||||
fprintf (stderr,
|
||||
"%s: option `%c%s' doesn't allow an argument\n",
|
||||
argv[0], argv[optind - 1][0], pfound->name);
|
||||
}
|
||||
nextchar += strlen (nextchar);
|
||||
return '?';
|
||||
}
|
||||
}
|
||||
else if (pfound->has_arg == 1)
|
||||
{
|
||||
if (optind < argc)
|
||||
optarg = argv[optind++];
|
||||
else
|
||||
{
|
||||
if (opterr)
|
||||
fprintf (stderr, "%s: option `%s' requires an argument\n",
|
||||
argv[0], argv[optind - 1]);
|
||||
nextchar += strlen (nextchar);
|
||||
return '?';
|
||||
}
|
||||
}
|
||||
nextchar += strlen (nextchar);
|
||||
if (longind != NULL)
|
||||
*longind = option_index;
|
||||
if (pfound->flag)
|
||||
{
|
||||
*(pfound->flag) = pfound->val;
|
||||
return 0;
|
||||
}
|
||||
return pfound->val;
|
||||
}
|
||||
/* Can't find it as a long option. If this is not getopt_long_only,
|
||||
or the option starts with '--' or is not a valid short
|
||||
option, then it's an error.
|
||||
Otherwise interpret it as a short option. */
|
||||
if (!long_only || argv[optind][1] == '-'
|
||||
#ifdef GETOPT_COMPAT
|
||||
|| argv[optind][0] == '+'
|
||||
#endif /* GETOPT_COMPAT */
|
||||
|| my_index (optstring, *nextchar) == NULL)
|
||||
{
|
||||
if (opterr)
|
||||
{
|
||||
if (argv[optind][1] == '-')
|
||||
/* --option */
|
||||
fprintf (stderr, "%s: unrecognized option `--%s'\n",
|
||||
argv[0], nextchar);
|
||||
else
|
||||
/* +option or -option */
|
||||
fprintf (stderr, "%s: unrecognized option `%c%s'\n",
|
||||
argv[0], argv[optind][0], nextchar);
|
||||
}
|
||||
nextchar += strlen (nextchar);
|
||||
optind++;
|
||||
return '?';
|
||||
}
|
||||
}
|
||||
|
||||
/* Look at and handle the next option-character. */
|
||||
|
||||
{
|
||||
char c = *nextchar++;
|
||||
char *temp = my_index (optstring, c);
|
||||
|
||||
/* Increment `optind' when we start to process its last character. */
|
||||
if (*nextchar == '\0')
|
||||
optind++;
|
||||
|
||||
if (temp == NULL || c == ':')
|
||||
{
|
||||
if (opterr)
|
||||
{
|
||||
if (c < 040 || c >= 0177)
|
||||
fprintf (stderr, "%s: unrecognized option, character code 0%o\n",
|
||||
argv[0], c);
|
||||
else
|
||||
fprintf (stderr, "%s: unrecognized option `-%c'\n", argv[0], c);
|
||||
}
|
||||
return '?';
|
||||
}
|
||||
if (temp[1] == ':')
|
||||
{
|
||||
if (temp[2] == ':')
|
||||
{
|
||||
/* This is an option that accepts an argument optionally. */
|
||||
if (*nextchar != '\0')
|
||||
{
|
||||
optarg = nextchar;
|
||||
optind++;
|
||||
}
|
||||
else
|
||||
optarg = 0;
|
||||
nextchar = NULL;
|
||||
}
|
||||
else
|
||||
{
|
||||
/* This is an option that requires an argument. */
|
||||
if (*nextchar != 0)
|
||||
{
|
||||
optarg = nextchar;
|
||||
/* If we end this ARGV-element by taking the rest as an arg,
|
||||
we must advance to the next element now. */
|
||||
optind++;
|
||||
}
|
||||
else if (optind == argc)
|
||||
{
|
||||
if (opterr)
|
||||
fprintf (stderr, "%s: option `-%c' requires an argument\n",
|
||||
argv[0], c);
|
||||
c = '?';
|
||||
}
|
||||
else
|
||||
/* We already incremented `optind' once;
|
||||
increment it again when taking next ARGV-elt as argument. */
|
||||
optarg = argv[optind++];
|
||||
nextchar = NULL;
|
||||
}
|
||||
}
|
||||
return c;
|
||||
}
|
||||
}
|
||||
|
||||
int
|
||||
getopt (argc, argv, optstring)
|
||||
int argc;
|
||||
char *const *argv;
|
||||
const char *optstring;
|
||||
{
|
||||
return _getopt_internal (argc, argv, optstring,
|
||||
(const struct option *) 0,
|
||||
(int *) 0,
|
||||
0);
|
||||
}
|
||||
|
||||
#ifdef TEST
|
||||
|
||||
/* Compile with -DTEST to make an executable for use in testing
|
||||
the above definition of `getopt'. */
|
||||
|
||||
int
|
||||
main (argc, argv)
|
||||
int argc;
|
||||
char **argv;
|
||||
{
|
||||
int c;
|
||||
int digit_optind = 0;
|
||||
|
||||
while (1)
|
||||
{
|
||||
int this_option_optind = optind ? optind : 1;
|
||||
|
||||
c = getopt (argc, argv, "abc:d:0123456789");
|
||||
if (c == EOF)
|
||||
break;
|
||||
|
||||
switch (c)
|
||||
{
|
||||
case '0':
|
||||
case '1':
|
||||
case '2':
|
||||
case '3':
|
||||
case '4':
|
||||
case '5':
|
||||
case '6':
|
||||
case '7':
|
||||
case '8':
|
||||
case '9':
|
||||
if (digit_optind != 0 && digit_optind != this_option_optind)
|
||||
printf ("digits occur in two different argv-elements.\n");
|
||||
digit_optind = this_option_optind;
|
||||
printf ("option %c\n", c);
|
||||
break;
|
||||
|
||||
case 'a':
|
||||
printf ("option a\n");
|
||||
break;
|
||||
|
||||
case 'b':
|
||||
printf ("option b\n");
|
||||
break;
|
||||
|
||||
case 'c':
|
||||
printf ("option c with value `%s'\n", optarg);
|
||||
break;
|
||||
|
||||
case '?':
|
||||
break;
|
||||
|
||||
default:
|
||||
printf ("?? getopt returned character code 0%o ??\n", c);
|
||||
}
|
||||
}
|
||||
|
||||
if (optind < argc)
|
||||
{
|
||||
printf ("non-option ARGV-elements: ");
|
||||
while (optind < argc)
|
||||
printf ("%s ", argv[optind++]);
|
||||
printf ("\n");
|
||||
}
|
||||
|
||||
exit (0);
|
||||
}
|
||||
|
||||
#endif /* TEST */
|
113
gnu/usr.bin/grep/getopt.h
Normal file
113
gnu/usr.bin/grep/getopt.h
Normal file
@ -0,0 +1,113 @@
|
||||
/* Declarations for getopt.
|
||||
Copyright (C) 1989, 1990, 1991, 1992 Free Software Foundation, Inc.
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2, or (at your option)
|
||||
any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License
|
||||
along with this program; if not, write to the Free Software
|
||||
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. */
|
||||
|
||||
#ifndef _GETOPT_H
|
||||
#define _GETOPT_H 1
|
||||
|
||||
/* For communication from `getopt' to the caller.
|
||||
When `getopt' finds an option that takes an argument,
|
||||
the argument value is returned here.
|
||||
Also, when `ordering' is RETURN_IN_ORDER,
|
||||
each non-option ARGV-element is returned here. */
|
||||
|
||||
extern char *optarg;
|
||||
|
||||
/* Index in ARGV of the next element to be scanned.
|
||||
This is used for communication to and from the caller
|
||||
and for communication between successive calls to `getopt'.
|
||||
|
||||
On entry to `getopt', zero means this is the first call; initialize.
|
||||
|
||||
When `getopt' returns EOF, this is the index of the first of the
|
||||
non-option elements that the caller should itself scan.
|
||||
|
||||
Otherwise, `optind' communicates from one call to the next
|
||||
how much of ARGV has been scanned so far. */
|
||||
|
||||
extern int optind;
|
||||
|
||||
/* Callers store zero here to inhibit the error message `getopt' prints
|
||||
for unrecognized options. */
|
||||
|
||||
extern int opterr;
|
||||
|
||||
/* Describe the long-named options requested by the application.
|
||||
The LONG_OPTIONS argument to getopt_long or getopt_long_only is a vector
|
||||
of `struct option' terminated by an element containing a name which is
|
||||
zero.
|
||||
|
||||
The field `has_arg' is:
|
||||
no_argument (or 0) if the option does not take an argument,
|
||||
required_argument (or 1) if the option requires an argument,
|
||||
optional_argument (or 2) if the option takes an optional argument.
|
||||
|
||||
If the field `flag' is not NULL, it points to a variable that is set
|
||||
to the value given in the field `val' when the option is found, but
|
||||
left unchanged if the option is not found.
|
||||
|
||||
To have a long-named option do something other than set an `int' to
|
||||
a compiled-in constant, such as set a value from `optarg', set the
|
||||
option's `flag' field to zero and its `val' field to a nonzero
|
||||
value (the equivalent single-letter option character, if there is
|
||||
one). For long options that have a zero `flag' field, `getopt'
|
||||
returns the contents of the `val' field. */
|
||||
|
||||
struct option
|
||||
{
|
||||
#if __STDC__
|
||||
const char *name;
|
||||
#else
|
||||
char *name;
|
||||
#endif
|
||||
/* has_arg can't be an enum because some compilers complain about
|
||||
type mismatches in all the code that assumes it is an int. */
|
||||
int has_arg;
|
||||
int *flag;
|
||||
int val;
|
||||
};
|
||||
|
||||
/* Names for the values of the `has_arg' field of `struct option'. */
|
||||
|
||||
enum _argtype
|
||||
{
|
||||
no_argument,
|
||||
required_argument,
|
||||
optional_argument
|
||||
};
|
||||
|
||||
#if __STDC__
|
||||
extern int getopt (int argc, char *const *argv, const char *shortopts);
|
||||
extern int getopt_long (int argc, char *const *argv, const char *shortopts,
|
||||
const struct option *longopts, int *longind);
|
||||
extern int getopt_long_only (int argc, char *const *argv,
|
||||
const char *shortopts,
|
||||
const struct option *longopts, int *longind);
|
||||
|
||||
/* Internal only. Users should not call this directly. */
|
||||
extern int _getopt_internal (int argc, char *const *argv,
|
||||
const char *shortopts,
|
||||
const struct option *longopts, int *longind,
|
||||
int long_only);
|
||||
#else /* not __STDC__ */
|
||||
extern int getopt ();
|
||||
extern int getopt_long ();
|
||||
extern int getopt_long_only ();
|
||||
|
||||
extern int _getopt_internal ();
|
||||
#endif /* not __STDC__ */
|
||||
|
||||
#endif /* _GETOPT_H */
|
234
gnu/usr.bin/grep/grep.1
Normal file
234
gnu/usr.bin/grep/grep.1
Normal file
@ -0,0 +1,234 @@
|
||||
.TH GREP 1 "1988 December 13" "GNU Project" \" -*- nroff -*-
|
||||
.UC 4
|
||||
.SH NAME
|
||||
grep, egrep \- print lines matching a regular expression
|
||||
.SH SYNOPSIS
|
||||
.B grep
|
||||
[
|
||||
.B \-CVbchilnsvwx
|
||||
] [
|
||||
.BI \- num
|
||||
] [
|
||||
.B \-AB
|
||||
.I num
|
||||
] [ [
|
||||
.B \-e
|
||||
]
|
||||
.I expr
|
||||
|
|
||||
.B \-f
|
||||
.I file
|
||||
] [
|
||||
.I "files ..."
|
||||
]
|
||||
.SH DESCRIPTION
|
||||
.I Grep
|
||||
searches the files listed in the arguments (or standard
|
||||
input if no files are given) for all lines that contain a match for
|
||||
the given
|
||||
.IR expr .
|
||||
If any lines match, they are printed.
|
||||
.PP
|
||||
Also, if any matches were found,
|
||||
.I grep
|
||||
exits with a status of 0, but if no matches were found it exits
|
||||
with a status of 1. This is useful for building shell scripts that
|
||||
use
|
||||
.I grep
|
||||
as a condition for, for example, the
|
||||
.I if
|
||||
statement.
|
||||
.PP
|
||||
When invoked as
|
||||
.I egrep
|
||||
the syntax of the
|
||||
.I expr
|
||||
is slightly different; See below.
|
||||
.br
|
||||
.SH "REGULAR EXPRESSIONS"
|
||||
.RS 2.5i
|
||||
.ta 1i 2i
|
||||
.sp
|
||||
.ti -2.0i
|
||||
(grep) (egrep) (explanation)
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\fIc\fP \fIc\fP a single (non-meta) character matches itself.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\&. . matches any single character except newline.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\? ? postfix operator; preceeding item is optional.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\(** \(** postfix operator; preceeding item 0 or
|
||||
more times.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\+ + postfix operator; preceeding item 1 or
|
||||
more times.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\| | infix operator; matches either
|
||||
argument.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
^ ^ matches the empty string at the beginning of a line.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
$ $ matches the empty string at the end of a line.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\< \\< matches the empty string at the beginning of a word.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\> \\> matches the empty string at the end of a word.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
[\fIchars\fP] [\fIchars\fP] match any character in the given class; if the
|
||||
first character after [ is ^, match any character
|
||||
not in the given class; a range of characters may
|
||||
be specified by \fIfirst\-last\fP; for example, \\W
|
||||
(below) is equivalent to the class [^A\-Za\-z0\-9]
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\( \\) ( ) parentheses are used to override operator precedence.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\\fIdigit\fP \\\fIdigit\fP \\\fIn\fP matches a repeat of the text
|
||||
matched earlier in the regexp by the subexpression inside the nth
|
||||
opening parenthesis.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\ \\ any special character may be preceded
|
||||
by a backslash to match it literally.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
(the following are for compatibility with GNU Emacs)
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\b \\b matches the empty string at the edge of a word.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\B \\B matches the empty string if not at the edge of a word.
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\w \\w matches word-constituent characters (letters & digits).
|
||||
.sp
|
||||
.ti -2.0i
|
||||
\\W \\W matches characters that are not word-constituent.
|
||||
.RE
|
||||
.PP
|
||||
Operator precedence is (highest to lowest) ?, \(**, and +, concatenation,
|
||||
and finally |. All other constructs are syntactically identical to
|
||||
normal characters. For the truly interested, the file dfa.c describes
|
||||
(and implements) the exact grammar understood by the parser.
|
||||
.SH OPTIONS
|
||||
.TP
|
||||
.BI \-A " num"
|
||||
print <num> lines of context after every matching line
|
||||
.TP
|
||||
.BI \-B " num"
|
||||
print
|
||||
.I num
|
||||
lines of context before every matching line
|
||||
.TP
|
||||
.B \-C
|
||||
print 2 lines of context on each side of every match
|
||||
.TP
|
||||
.BI \- num
|
||||
print
|
||||
.I num
|
||||
lines of context on each side of every match
|
||||
.TP
|
||||
.B \-V
|
||||
print the version number on the diagnostic output
|
||||
.TP
|
||||
.B \-b
|
||||
print every match preceded by its byte offset
|
||||
.TP
|
||||
.B \-c
|
||||
print a total count of matching lines only
|
||||
.TP
|
||||
.BI \-e " expr"
|
||||
search for
|
||||
.IR expr ;
|
||||
useful if
|
||||
.I expr
|
||||
begins with \-
|
||||
.TP
|
||||
.BI \-f " file"
|
||||
search for the expression contained in
|
||||
.I file
|
||||
.TP
|
||||
.B \-h
|
||||
don't display filenames on matches
|
||||
.TP
|
||||
.B \-i
|
||||
ignore case difference when comparing strings
|
||||
.TP
|
||||
.B \-l
|
||||
list files containing matches only
|
||||
.TP
|
||||
.B \-n
|
||||
print each match preceded by its line number
|
||||
.TP
|
||||
.B \-s
|
||||
run silently producing no output except error messages
|
||||
.TP
|
||||
.B \-v
|
||||
print only lines that contain no matches for the <expr>
|
||||
.TP
|
||||
.B \-w
|
||||
print only lines where the match is a complete word
|
||||
.TP
|
||||
.B \-x
|
||||
print only lines where the match is a whole line
|
||||
.SH "SEE ALSO"
|
||||
emacs(1), ed(1), sh(1),
|
||||
.I "GNU Emacs Manual"
|
||||
.SH INCOMPATIBILITIES
|
||||
The following incompatibilities with UNIX
|
||||
.I grep
|
||||
exist:
|
||||
.PP
|
||||
.RS 0.5i
|
||||
The context-dependent meaning of \(** is not quite the same (grep only).
|
||||
.PP
|
||||
.B \-b
|
||||
prints a byte offset instead of a block offset.
|
||||
.PP
|
||||
The {\fIm,n\fP} construct of System V grep is not implemented.
|
||||
.PP
|
||||
.SH BUGS
|
||||
GNU \fIe?grep\fP has been thoroughly debugged and tested over a period
|
||||
of several years; we think it's a reliable beast or we wouldn't
|
||||
distribute it. If by some fluke of the universe you discover a bug,
|
||||
send a detailed description (including options, regular expressions,
|
||||
and a copy of an input file that can reproduce it) to mike@ai.mit.edu.
|
||||
.PP
|
||||
.SH AUTHORS
|
||||
Mike Haertel wrote the deterministic regexp code and the bulk
|
||||
of the program.
|
||||
.PP
|
||||
James A. Woods is responsible for the hybridized search strategy
|
||||
of using Boyer-Moore-Gosper fixed-string search as a filter
|
||||
before calling the general regexp matcher.
|
||||
.PP
|
||||
Arthur David Olson contributed code that finds fixed strings for
|
||||
the aforementioned BMG search for a large class of regexps.
|
||||
.PP
|
||||
Richard Stallman wrote the backtracking regexp matcher that is used
|
||||
for \\\fIdigit\fP backreferences, as well as the GNU getopt. The
|
||||
backtracking matcher was originally written for GNU Emacs.
|
||||
.PP
|
||||
D. A. Gwyn wrote the C alloca emulation that is provided so
|
||||
System V machines can run this program. (Alloca is used only
|
||||
by RMS' backtracking matcher, and then only rarely, so there
|
||||
is no loss if your machine doesn't have a "real" alloca.)
|
||||
.PP
|
||||
Scott Anderson and Henry Spencer designed the regression tests
|
||||
used in the "regress" script.
|
||||
.PP
|
||||
Paul Placeway wrote the original version of this manual page.
|
1007
gnu/usr.bin/grep/grep.c
Normal file
1007
gnu/usr.bin/grep/grep.c
Normal file
File diff suppressed because it is too large
Load Diff
32
gnu/usr.bin/grep/tests/khadafy.lines
Normal file
32
gnu/usr.bin/grep/tests/khadafy.lines
Normal file
@ -0,0 +1,32 @@
|
||||
1) Muammar Qaddafi
|
||||
2) Mo'ammar Gadhafi
|
||||
3) Muammar Kaddafi
|
||||
4) Muammar Qadhafi
|
||||
5) Moammar El Kadhafi
|
||||
6) Muammar Gadafi
|
||||
7) Mu'ammar al-Qadafi
|
||||
8) Moamer El Kazzafi
|
||||
9) Moamar al-Gaddafi
|
||||
10) Mu'ammar Al Qathafi
|
||||
11) Muammar Al Qathafi
|
||||
12) Mo'ammar el-Gadhafi
|
||||
13) Moamar El Kadhafi
|
||||
14) Muammar al-Qadhafi
|
||||
15) Mu'ammar al-Qadhdhafi
|
||||
16) Mu'ammar Qadafi
|
||||
17) Moamar Gaddafi
|
||||
18) Mu'ammar Qadhdhafi
|
||||
19) Muammar Khaddafi
|
||||
20) Muammar al-Khaddafi
|
||||
21) Mu'amar al-Kadafi
|
||||
22) Muammar Ghaddafy
|
||||
23) Muammar Ghadafi
|
||||
24) Muammar Ghaddafi
|
||||
25) Muamar Kaddafi
|
||||
26) Muammar Quathafi
|
||||
27) Muammar Gheddafi
|
||||
28) Muamar Al-Kaddafi
|
||||
29) Moammar Khadafy
|
||||
30) Moammar Qudhafi
|
||||
31) Mu'ammar al-Qaddafi
|
||||
32) Mulazim Awwal Mu'ammar Muhammad Abu Minyar al-Qadhafi
|
1
gnu/usr.bin/grep/tests/khadafy.regexp
Normal file
1
gnu/usr.bin/grep/tests/khadafy.regexp
Normal file
@ -0,0 +1 @@
|
||||
M[ou]'?am+[ae]r .*([AEae]l[- ])?[GKQ]h?[aeu]+([dtz][dhz]?)+af[iy]
|
Loading…
Reference in New Issue
Block a user