-P in gnugrepland means PCRE, which we do not support. We may eventually
support it if onigmo ends up getting imported as a more performant regex
implementation, and we can re-add it properly in these places (and more)
when that time comes.
The optstr change is a functional nop; the case was not explicitly handled,
thus ending in usage() anyways.
Reported by: Vladimir Misev (via twitter)
It seems that the number of lines is no longer an optional parameter to
the -C flag. Document it accordingly both in the manual page and the
usage message.
Reviewed by: yuripv
Approved by: yuripv
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D28509
The null pattern semantics were terrible because I tried to match gnugrep,
but I got it wrong. Let's unwind that:
- The null pattern should match every line if neither -w nor -x.
- The null pattern should match empty lines if -x.
- The null pattern should not match any lines if -w.
The first two will stop processing (shortcut) even if additional patterns
are specified. In any other case, we will continue processing other
patterns. If no other patterns are specified beside a null pattern, then
we match if neither -w nor -x or set and do not match if either of those
are specified.
The justification for -w is that it should match on a whole word, but the
null pattern deos not have a whole word to match on.
Empty pattern files should never match anything, and more importantly, -v
should cause everything to be written.
PR: 253209
MFC-after: 4 days
We know up front how many items we can have in the queue (-B/Bflag), so
pay the cost of those particular allocations early on.
The reduced queue maintenance overhead seemed to yield about an ~8%
improvement for my earlier `grep -C8 -r closefrom .` test.
MFC after: 2 weeks
This was introduced and then disabled by default primarily to avoid dealing
with bugs in libgnuregex. rS363823 switched to using libregex for it, so
let's just rip the option out now so we can make sure we're getting tested
with libregex via bsdgrep.
Reviewed by: emaste
Differential Revision: https://reviews.freebsd.org/D27476
libregex is incomplete, but it's a bit less buggy than the in-base
libgnuregex and mostly OK.
While here, rename -DIWTH_GNU -> -DWITH_GNU_COMPAT; the option implies
that we're compatible with the GNU counterpart, not that we're including GNU
anything.
When an empty pattern is encountered in the pattern list, I had previously
broken bsdgrep to count that as a "match all" and ignore any other patterns
in the list. This commit rectifies that mistake, among others:
- The -v flag semantics were not quite right; lines matched should have been
counted differently based on whether the -v flag was set or not. procline
now definitively returns whether it's matched or not, and interpreting
that result has been kicked up a level.
- Empty patterns with the -x flag was broken similarly to empty patterns
with the -w flag. The former is a whole-line match and should be more
strict, only matching blank lines. No -x and no -w will will match the
empty string at the beginning of each line.
- The exit code with -L was broken, w.r.t. modern grep. Modern grap will
exit(0) if any file that didn't match was output, so our interpretation
was simply backwards. The new interpretation makes sense to me.
Tests updated and added to try and catch some of this.
This misbehavior was found by autoconf while fixing ports found in PR 229925
expecting either a more sane or a more GNU-like sed.
MFC after: 1 week
This 'r' case should have belonged to the switch in the first place, but
I had somehow missed the switch when initially adding the rgrep link. The
zgrep script later came along and faithfully left this case standing alone,
so we will now go ahead and join it.
Nearby comment also adjusted a tad bit for wording and style.
Reported by: Daniel Ebdrup
MFC after: 3 days
Neither procfile nor grep_tree return anything meaningful to their callers.
None of the callers actually care about how many lines were matched in all
of the files they processed; it's all about "did anything match?"
This is generally just a light refactoring to remind me of what actually
matters as I'm rewriting these bits to care less about 'stuff'.
GNU grep as in actually in base does not have any translations support
compiled in, so no functionnality loss.
We do support 193 locales in base, we will never catch up on that number of
translation with bsd grep.
Removing NLS support make bsd grep consistent with the other binaries in base
which are not translated, and also reduce a little bit the code.
Reviewed by: kevans
Approved by: kevans
Discussed with: kevans @BSDCan
Differential Revision: https://reviews.freebsd.org/D15682
A version of this patch was originally sent to me by se@, matching behavior
from newer versions of GNU grep.
While there have been some differences of opinion on whether stdin should be
closed or not after depleting it in process of -f, I've opted to leave stdin
open and just let the later matching stuff fail and result in a no-match.
I'm not married to the current behavior- it was generally chosen since we
are adopting this in particular from GNU grep, and I would like to stay
consistent without a strong argument to the contrary. The current behavior
isn't technically wrong, it's just fairly unfriendly to the developer-user
of grep that may not realize their usage is trivially invalid.
Submitted by: se
It was an old TRE that had plenty of bugs and no performance gain over
regex(3). I disabled it by default in r323615, and there was some confusion
about what the knob does- likely due to poor naming on my part- to the tune
of "well, it sounds like it should speed things up" (mentioned by multiple
people).
To compound this, I have no intention of maintaining a second regex
implementation. If someone would like to step up and volunteer to maintain a
lean-and-mean implementation for grep, this is OK, but we have very few
volunteers to maintain even our primary regex implementation.
Compression was removed so #2 goes away and everything else needs renumbered
to match, and the usage string was also updated due to removed options.
X-MFC-With: r332995
Compression support is now handled by an external script, remove it from the
bsdgrep(1) utility.
This removes the support for -Z -J -X and -M
Note: that it matches the changes in newer GNU grep
Reviewed by: kevans
Approved by: kevans
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D15197
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.
The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.
No functional change intended.
fgrep/grep -F will error out at runtime if compiled with a regex(3)
that does not define REG_NOSPEC or REG_LITERAL. glibc is one such regex(3)
implementation, and as it turns out they don't support literal matching at
all.
Provide a primitive literal matcher for use with glibc and other
implementations that don't support literal matching so that we don't
completely lose fgrep/grep -F if compiled against libgnuregex on stable/10,
stable/11, or other systems that we don't necessarily support.
This is a wholly unoptimized implementation with no plans to optimize it as
of now. This is due to both its use-case being primarily on unsupported
systems in the near-distant future and that it's reinventing the wheel that
we already have available as a feature of regex(3).
Reviewed by: cem, emaste, ngie
Approved by: emaste (mentor)
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D12056
Correct a couple of minor BSD grep assumptions that are valid for line
processing but not future chunk-based processing.
Submitted by: Kyle Evans <kevans91@ksu.edu>
Reviewed by: bapt, cem
Differential Revision: https://reviews.freebsd.org/D10824
Previously, when given a negative -A/-B/-C argument bsdgrep would
overflow the respective context flag(s) and exhibited surprising
behavior.
Fix this by removing unsignedness of Aflag/Bflag and erroring out if
we're given a value < 0. Also adjust the type used to track 'tail'
context in procfile() so that it accurately reflects the Aflag value
rather than overflowing and losing trailing context.
This also fixes an inconsistency previously existing between -n and
-C "n" behavior. They are now both limited to LLONG_MAX, to be
consistent.
Add some test cases to make sure grep errors out properly for both
negative context values as well as non-numeric context values rather
than giving bogus matches.
Submitted by: Kyle Evans <kevans91@ksu.edu>
Reviewed by: cem
Differential Revision: https://reviews.freebsd.org/D10675
-w flag matching with an empty pattern was generally 'broken', allowing
matches to occur on any line whether or not it actually matches -w
criteria.
This fix required a good amount of refactoring to address. procline()
is altered to *only* process the line and return whether it was a match
or not, necessary to be able to short-circuit the whole function in case
of this matchall flag. -m flag handling is moved out as well because it
suffers from the same fate as context handling if we bypass any actual
pattern matching.
The matching context (matches, mostly) didn't previously exist outside
of procline(), so we go ahead and create context object for file
processing bits to pass around. grep_printline() was created due to
this, for the scenarios where the matches don't actually matter and we
just want to print a line or two, a la flushing the context queue and
no -o or --color specified.
Damage from this broken behavior would have been mitigated by the fact
that it is unlikely users would invoke grep -w with an empty pattern.
This was identified while checking PR 105221 for problems it this may
cause in BSD grep, but PR 105221 is *not* a report of this behavior.
Submitted by: Kyle Evans <kevans91 at ksu.edu>
Differential Revision: https://reviews.freebsd.org/D10433
As reported in r218614 it's useful to have an indication of whether or not
BSD grep was built with GNU_GREP_COMPAT.
Submitted by: Kyle Evans <kevans91 at ksu.edu>
Reported by: mandree
Differential Revision: https://reviews.freebsd.org/D10451
Bugs have been found in the fastmatch implementation as used in bsdgrep.
Some have been fixed (r316495) while fixes for others are in review
(D10098).
In comparison with the fastmatch implementation, Kyle Evans found that:
- regex(3)'s performance with literal expressions offers a speed
improvement over fastmatch
- regex(3)'s performance, both with simple BREs and EREs, seems to be
comparable
The regex implementation was imported in r226035, and the commit message
reports:
This is a temporary solution until the whole regex library is
not replaced so that BSD grep development can continue and the
backported code gets some review and testing. This change only
improves scalability slightly, there is no big performance boost
yet but several minor bugs have been found and fixed.
Introduce a WITH_/WITHOUT_BSD_GREP_FASTMATCH knob to support testing
of both approaches.
PR: 175314, 194823
Submitted by: Kyle Evans <kevans91 at ksu.edu>
Reviewed by: bdrewery (in part)
Differential Revision: https://reviews.freebsd.org/D10282
This is more sensible than the previous behaviour of grepping stdin,
and matches newer GNU grep behaviour.
PR: 216307
Submitted by: Kyle Evans <kevans91 at ksu.edu>
Reviewed by: cem, emaste, ngie
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/
-z treats input and output data as sequences of lines terminated by a
zero byte instead of a newline. This brings it more in line with GNU grep
and brings us closer to passing the current tests with BSD grep.
Submitted by: Kyle Evans <kevans91 at ksu.edu>
Reviewed by: cem
Relnotes: Yes
Differential Revision: https://reviews.freebsd.org/D10101
Bring a couple of changes from NetBSD:
queue.c (CVS Rev. 1.4. 1.5)
Fix memory leaks.
NULL does not need a cast.
grep.c (CVS Rev. 1.6)
Use the more portable getline.
Obtained from: NetBSD
MFC after: 3 days
In addition to adding `static' where possible:
- bin/date: Move `retval' into extern.h to make it visible to date.c.
- bin/ed: Move globally used variables into ed.h.
- sbin/camcontrol: Move `verbose' into camcontrol.h and fix shadow warnings.
- usr.bin/calendar: Remove unneeded variables.
- usr.bin/chat: Make `line' local instead of global.
- usr.bin/elfdump: Comment out unneeded function.
- usr.bin/rlogin: Use _Noreturn instead of __dead2.
- usr.bin/tset: Pull `Ospeed' into extern.h.
- usr.sbin/mfiutil: Put global variables in mfiutil.h.
- usr.sbin/pkg: Remove unused `os_corres'.
- usr.sbin/quotaon, usr.sbin/repquota: Remove unused `qfname'.
- Allow disabling bzip2 support with WITHOUT_BZIP2
- Fix handling patterns that start with a dot
- Remove superfluous semicolon
Approved by: delphij (mentor)
backported that was written for the TRE integration project in Google
Summer of Code 2011. This is a temporary solution until the whole
regex library is not replaced so that BSD grep development can continue
and the backported code gets some review and testing. This change only
improves scalability slightly, there is no big performance boost yet
but several minor bugs have been found and fixed.
Approved by: delphij (mentor)
Sposored by: Google Summer of Code 2011
MFC after: 1 week
- Makefile nit
- Add more CVS/SVN keywords to make it easier to track changes from NetBSD
in case they add further improvements
Approved by: delphij (mentor)
Obtained from: The NetBSD Project
- Imply -h if single file is grepped, this is the GNU behaviour
This is already done by code above the change and have caused a regression
since this instance of code does not check Hflag.
Reported by: davidxu
Pointy hat to: delphij
former may be safer but in this case it doesn't add extra
safety [1]
- Fix -w option [2]
- Fix handling of GREP_OPTIONS [3]
- Fix --line-buffered
- Make stdin input imply --line-buffered so that tail -f can be piped
to grep [4]
- Imply -h if single file is grepped, this is the GNU behaviour
- Reduce locking overhead to gain some more performance [5]
- Inline some functions to help the compiler better optimize the code
- Use shortcut for empty files [6]
PR: bin/149425 [6]
Prodded by: jilles [1]
Reported by: Alex Kozlov <spam@rm-rf.kiev.ua> [2] [3],
swell.k@gmail.com [2],
poyopoyo@puripuri.plala.or.jp [4]
Submitted by: scf [5],
Shuichi KITAGUCHI <ki@hh.iij4u.or.jp> [6]
Approved by: delphij (mentor)
and exclusion patterns [1]
- Some improvements on the exiting code, like replacing memcpy with
strlcpy/strcpy
Approved by: delphij (mentor)
Pointed out by: bf [1], des [1]
or if forced mode is specified [1]
- While here, add some alternative names for the options and make then
case-insensitive
- Fix -q and -l behaviour [2]
- Some small changes to make the code easier to review
Submitted by: swell.k@gmail.com [1],
dougb [2]
Approved by: delphij (mentor)
- Explicitly pre-zero memory for fts_open parameters.
- Don't test against directory patterns when we are testing direct
leaf of current directory.
While I'm there plug a few of memory leaks.