Commit Graph

168 Commits

Author SHA1 Message Date
bdrewery
931a95a66b Fix exit code for mismatches after r333013.
The -c flag still does the wrong thing versus the older version due to
lack of pipefail support.

Reported by:	antoine
2018-05-24 22:15:47 +00:00
kevans
e3f7dbda44 Standardize SPDX tag on files I've added 2018-05-09 16:52:28 +00:00
kevans
3ca335caec Remove "All Rights Reserved" on files that I hold sole copyright on
See r333391 for more detail; in summary: it holds no weight and may be
removed.
2018-05-09 16:44:19 +00:00
kevans
50d8d78065 bsdgrep: Allow "-" to be passed to -f to mean "standard input"
A version of this patch was originally sent to me by se@, matching behavior
from newer versions of GNU grep.

While there have been some differences of opinion on whether stdin should be
closed or not after depleting it in process of -f, I've opted to leave stdin
open and just let the later matching stuff fail and result in a no-match.
I'm not married to the current behavior- it was generally chosen since we
are adopting this in particular from GNU grep, and I would like to stay
consistent without a strong argument to the contrary. The current behavior
isn't technically wrong, it's just fairly unfriendly to the developer-user
of grep that may not realize their usage is trivially invalid.

Submitted by:	se
2018-05-08 03:53:46 +00:00
kevans
da653eef52 bsdgrep: annihilate our in-tree TRE, previously disabled by default
It was an old TRE that had plenty of bugs and no performance gain over
regex(3). I disabled it by default in r323615, and there was some confusion
about what the knob does- likely due to poor naming on my part- to the tune
of "well, it sounds like it should speed things up" (mentioned by multiple
people).

To compound this, I have no intention of maintaining a second regex
implementation. If someone would like to step up and volunteer to maintain a
lean-and-mean implementation for grep, this is OK, but we have very few
volunteers to maintain even our primary regex implementation.
2018-05-04 03:13:25 +00:00
kevans
2e45301e4f zgrep(1): Note that -r/-R are not currently supported.
This is better behavior than just silently doing the wrong thing. We do not
currently have plans to support -r/-R with the compression-enabled greps.

Reported by:	jilles
2018-05-03 02:56:13 +00:00
kevans
315fc5a7dc bsdgrep: Adjust a missed NLS reference that was invalidated by recent work
Submitted by:	Dan McGregor <dan.mcgregor@usask.ca>
2018-05-02 15:45:31 +00:00
bapt
479a325d65 zgrep.sh: Add forgotten ';' and remove set -e which prevents looping over files
not matching
2018-04-25 21:01:02 +00:00
bapt
88f908a092 zgrep.sh: remove now useless shift 2018-04-25 20:55:18 +00:00
bapt
e83d1023bf zgrep: small improvements
* Use slightly more efficient method to determine the name of the program
called [1]
* Use nicer form to loop over arguments [1]
* add special support for --version along with -V previously added by kevans

Reported by:	jilles@ [1]
2018-04-25 20:52:17 +00:00
kevans
6df27523c9 <compress>grep: Special case the -V flag
In case we need version information of the ultimately chosen grep, allowing
`zgrep -V` to operate would be most helpful.
2018-04-25 18:59:29 +00:00
kevans
439f39456d bsdgrep(1): Sneak in some man page updates
- The --exclude{,-dir} and --include{,-dir} directives now match GNU
  behavior of being processed in order and latest matching directive wins

- --label was previously not really documented, and -L and -l did not
  indicate that --label applied to them

- The flags listed as being extensions to POSIX spec were not updated with
  the removal of compression-related flags

MFC after:	1 week
2018-04-25 16:28:51 +00:00
kevans
0bc264c02a bsdgrep: Update NLS catalogs after r332995
Compression was removed so #2 goes away and everything else needs renumbered
to match, and the usage string was also updated due to removed options.

X-MFC-With: r332995
2018-04-25 15:41:50 +00:00
bapt
a4e032fde9 Remove compression support from bsdgrep
Compression support is now handled by an external script, remove it from the
bsdgrep(1) utility.
This removes the support for -Z -J -X and -M

Note: that it matches the changes in newer GNU grep

Reviewed by:	kevans
Approved by:	kevans
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D15197
2018-04-25 14:40:15 +00:00
bapt
2f45771b4b Use a script wrapper for <compress>grep
Import the wrapper script from zstdgrep (written by wiz@netbsd.org)

Modify it to support more than just zstd (adding support for gzip,
lzma, xz and bzip2)

Write a simple manpage dedicated for it.

Only use that new wrapper both for gnu grep and bsd grep

Next step will be removing code related to compression format from bsdgrep

Reviewed by:	kevans
Approved by:	kevans
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D15193
2018-04-25 13:23:58 +00:00
kevans
5cbcde6d07 bsdgrep: Fix build failure WITHOUT_LZMA (incorrect bracket placement)
Submitted by:	sbruno
Reported by:	sbruno
2018-04-22 23:51:24 +00:00
kevans
334fa0aab4 bsdgrep: Use grep_strdup instead of grep_malloc+strcpy 2018-04-21 14:58:45 +00:00
kevans
985dd72ad9 bsdgrep: Fix --include/--exclude ordering issues
Prior to r332851:
* --exclude always win out over --include
* --exclude-dir always wins out over --include-dir

r332851 broke that behavior, resulting in:
* First of --exclude, --include wins
* First of --exclude-dir, --include-dir wins

As it turns out, both behaviors are wrong by modern grep standards- the
latest rule wins. e.g.:

`grep --exclude foo --include foo 'thing' foo`
foo is included

`grep --include foo --exclude foo 'thing' foo`
foo is excluded

As tested with GNU grep 3.1.

This commit makes bsdgrep follow this behavior.

Reported by:	se
2018-04-21 13:46:07 +00:00
kevans
564bfbdaa2 bsdgrep: if chain => switch
This makes some of this a little easier to follow (in my opinion).
2018-04-21 01:42:02 +00:00
kevans
ad41916557 bsdgrep: More trivial cleanup/style cleanup
We can avoid branching for these easily reduced patterns
2018-04-21 01:33:13 +00:00
kevans
00c090fd14 bsdgrep: Some light cleanup
There's no point checking for a bunch of file modes if we're not a
practicing believer of DIR_SKIP or DEV_SKIP.

This also reduces some style violations that were particularly ugly looking
when browsing through.
2018-04-21 01:02:35 +00:00
kevans
0dde510be5 bsdgrep: Break procmatches down a little bit more
Split the matching and non-matching cases out into their own functions to
reduce future complexity. As the name implies, procmatches will eventually
process more than one match itself in the future.
2018-04-20 18:06:03 +00:00
kevans
49168f0878 bsdgrep: Add some TODOs for future work on operating on chunks 2018-04-20 03:29:06 +00:00
kevans
68c7a8f091 bsdgrep: Clean up procmatches a little bit 2018-04-20 03:11:51 +00:00
kevans
1c4340834f bsdgrep: Split match processing out of procfile
procfile is getting kind of hairy, and it's not going to get better as we
correct some more bits that assume we process one line at a time.
2018-04-20 03:08:46 +00:00
kevans
fcbcde8be0 grep test: Fix copyright notice
The copyright notice was erroneously introduced as one from the NetBSD
foundation due to it being copied from a file in the NetBSD test suite, but
this file itself is not derived from or supplied with the NetBSD test suite.

MFC after:	3 days
2017-12-03 02:23:29 +00:00
pfg
7551d83c35 various: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:37:16 +00:00
bdrewery
a598c4b809 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-10-31 00:07:04 +00:00
emaste
abccd61ee9 fastmatch.h: remove duplicate #defines
Reviewed by:	kevans
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D12375
2017-09-15 13:34:00 +00:00
kevans
17df4eb706 bsdgrep: add a primitive literal matcher
fgrep/grep -F will error out at runtime if compiled with a regex(3)
that does not define REG_NOSPEC or REG_LITERAL. glibc is one such regex(3)
implementation, and as it turns out they don't support literal matching at
all.

Provide a primitive literal matcher for use with glibc and other
implementations that don't support literal matching so that we don't
completely lose fgrep/grep -F if compiled against libgnuregex on stable/10,
stable/11, or other systems that we don't necessarily support.

This is a wholly unoptimized implementation with no plans to optimize it as
of now. This is due to both its use-case being primarily on unsupported
systems in the near-distant future and that it's reinventing the wheel that
we already have available as a feature of regex(3).

Reviewed by:	cem, emaste, ngie
Approved by:	emaste (mentor)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D12056
2017-08-24 01:23:33 +00:00
kevans
6dbe6995bd bsdgrep: cast pmatch.rm_so to fix build when linking against libgnuregex
Reported by:	many
Approved by:	emaste (mentor)
MFC after:	immediate
2017-08-17 13:40:45 +00:00
ngie
d26727d972 Add HAS_TESTS to all Makefiles that are currently using the
`SUBDIR.${MK_TESTS}+= tests` idiom.

This is a follow up to r321912.
2017-08-02 08:50:42 +00:00
ngie
734d081ed1 MFhead@r321912 2017-08-02 08:38:36 +00:00
kevans
e1afa740b3 bsdgrep(1): Don't exit before processing every file
Given an empty pattern (i.e. grep "" A B), bsdgrep(1) would previously exit()
with the appropriate exit code upon encountering an empty file. Likely intended
as an optimization, but this behavior is technically incorrect since an empty
pattern should match every line.

PR:		220924
Reviewed by:	emaste, cem (earlier version), ngie
Approved by:	emaste (mentor)
Differential Revision:	https://reviews.freebsd.org/D11698
2017-07-25 01:50:37 +00:00
kevans
5d0b504e1b Update copyright e-mail address to @FreeBSD.org address
Approved by:	emaste (mentor)
Differential Revision:	https://reviews.freebsd.org/D11508
2017-07-06 19:53:30 +00:00
bdrewery
6c4b8e235b Utilize SYSROOT from r320119 in places where DESTDIR may be wanting WORLDTMP.
Since buildenv exports SYSROOT all of these uses will now look in
WORLDTMP by default.

sys/boot/efi/loader/Makefile
        A LIBSTAND hack is no longer required for buildenv.

MFC after:	2 weeks
Sponsored by:	Dell EMC Isilon
2017-06-19 20:47:24 +00:00
emaste
2298c6d8b6 bsdgrep: bump version number and add Kyle Evans copyright
The following changes have been made over the last couple of months:

Features:

 - With bsdgrep -r, the working directory is implied if no directory is
   specified
 - bsdgrep will now behave as bsdgrep -r does when it's named rgrep
 - bsdgrep now understands -z/--null-data to use \0 as EOL
 - GNU regex compatibility is now indicated with a "GNU compatible" in
   the version string

Fixes:

 - --mmap no longer hangs when coming across an EOF without an
   accompanying EOL
 - -o/--color matching generally improved, now produces earliest /
   longest matches
 - Context output now more closely aligns with GNU grep
 - Zero-length matches no longer exhibit broken behavior
 - Every output line now honors -b/-H/-n flags

Tests have been added for previous regressions as well as other
previously untested behaviors.

Various other fixes have been commited, and refactoring for further /
later improvements has taken place.

(The original submission changed the version string to 2.5.2, but I
decided to use 2.6.0 to reflect the addition of new features.)

Submitted by:	Kyle Evans <kevans91@ksu.edu>
Differential Revision:	https://reviews.freebsd.org/D10982
2017-05-29 13:10:01 +00:00
ngie
17c97095fe :rgrep : use atf-check to check the exit code/save the output of grep -r instead
of calling grep -r without it, and saving the output to a file

This ensures that any errors thrown via grep -r are caught, not lost, and uses
existing atf-sh idioms for saving files.

Tested with:	bsdgrep, gnu grep (base, ports)
Sponsored by:	Dell EMC Isilon
2017-05-27 22:40:20 +00:00
emaste
a6bc4f125a bsdgrep: use safer sizeof() construct
Submitted by:	Kyle Evans <kevans91@ksu.edu>
2017-05-26 03:35:59 +00:00
emaste
6b50dcbb8c bsdgrep: correct assumptions to prepare for chunking
Correct a couple of minor BSD grep assumptions that are valid for line
processing but not future chunk-based processing.

Submitted by:	Kyle Evans <kevans91@ksu.edu>
Reviewed by:	bapt, cem
Differential Revision:	https://reviews.freebsd.org/D10824
2017-05-26 02:30:26 +00:00
emaste
5249e4567c bsdgrep: Correct per-line line metadata printing
Metadata printing with -b, -H, or -n flags suffered from a few flaws:

1) -b/offset printing was broken when used in conjunction with -o

2) With -o, bsdgrep did not print metadata for every match/line, just
   the first match of a line

3) There were no tests for this

Address these issues by outputting this data per-match if the -o flag is
specified, and prior to outputting any matches if -o but not --color,
since --color alone will not generate a new line of output for every
iteration over the matches.

To correct -b output, fudge the line offset as we're printing matches.

While here, make sure we're using grep_printline in -A context.  Context
printing should *never* look at the parsing context, just the line.

The tests included do not pass with gnugrep in base due to it exhibiting
similar quirky behavior that bsdgrep previously exhibited.

Submitted by:	Kyle Evans <kevans91@ksu.edu>
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D10580
2017-05-20 11:20:03 +00:00
emaste
3ea00bb93c bsdgrep: emit more than MAX_LINE_MATCHES per line
We should not set an arbitrary cap on the number of matches on a line,
and in any case MAX_LINE_MATCHES of 32 is much too low.  Instead, if we
match more than MAX_LINE_MATCHES, keep processing and matching from the
last match until all are found.

For the regression test, we produce 4096 matches (larger than we expect
we'll ever set MAX_LINE_MATCHES) and make sure we actually get 4096
lines of output with the -o flag.

We'll also make sure that every distinct line is getting its own line
number to detect line metadata not being printed as appropriate along
the way.

PR:		218811
Submitted by:	Kyle Evans <kevans91@ksu.edu>
Reported by:	jbeich
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D10577
2017-05-20 03:51:31 +00:00
emaste
f7fff297c0 bsdgrep: fix segfault with --mmap
r313948 partially fixed --mmap behavior but was incomplete.  This commit
generally reverts it and does it the more correct way- by just consuming
the rest of the buffer and moving on.

PR:		219402
Submitted by:	Kyle Evans <kevans91@ksu.edu>
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D10820
2017-05-20 00:42:47 +00:00
emaste
a32ff2cabf bsdgrep: don't allow negative -A / -B / -C
Previously, when given a negative -A/-B/-C argument bsdgrep would
overflow the respective context flag(s) and exhibited surprising
behavior.

Fix this by removing unsignedness of Aflag/Bflag and erroring out if
we're given a value < 0.  Also adjust the type used to track 'tail'
context in procfile() so that it accurately reflects the Aflag value
rather than overflowing and losing trailing context.

This also fixes an inconsistency previously existing between -n and
-C "n" behavior.  They are now both limited to LLONG_MAX, to be
consistent.

Add some test cases to make sure grep errors out properly for both
negative context values as well as non-numeric context values rather
than giving bogus matches.

Submitted by:	Kyle Evans <kevans91@ksu.edu>
Reviewed by:	cem
Differential Revision:	https://reviews.freebsd.org/D10675
2017-05-15 17:51:01 +00:00
bdrewery
f7f6293381 DIRDEPS_BUILD: Update dependencies.
Sponsored by:	Dell EMC Isilon
2017-05-09 01:48:23 +00:00
emaste
dfcc5fbd50 bsdgrep: don't ouptut matches with -c, -l, -L
Refactoring done in r317703 broke -c, -l, and -L flags implying
suppression of match printing.  Fortunately this is just a matter of not
doing any printing of the resulting matches and context printing was not
broken in this refactoring.

Add some regression tests since this area may still see further
refactoring, include different context flags as well even though they
were not broken in this case.

PR:		219077
Submitted by:	Kyle kevans91@ksu.edu
Reported by:	markj
Reviewed by:	cem, ngie
Differential Revision:	https://reviews.freebsd.org/D10607
2017-05-05 17:35:05 +00:00
emaste
768ea40cd0 bsdgrep: correct uninitialized variable introduced in r317703
CID:		1374747
Submitted by:	Kyle Evans <kevans91@ksu.edu>
2017-05-03 13:47:02 +00:00
emaste
2b92fcdf22 bsdgrep: avoid use of magic number for REG_NOSPEC
Submitted by:	Kyle Evans <kevans91 at ksu.edu>
Differential Revision:	https://reviews.freebsd.org/D10420
2017-05-02 21:08:38 +00:00
emaste
736f3bcfc6 bsdgrep: fix escape map building for multibyte strings
In BSD grep, fix escape map building in the regex parser. It was
previously using memory not explicitly initialized, and the MBS escape
map was being built based on a version of the pattern with escapes
already parsed out.

This is Kyle's change, but I restored the broken style that already
exists in this file.

Submitted by:	Kyle Evans <kevans91 at ksu.edu>
Reviewed by:	cem, Kyle Evans (my style changes)
Differential Revision:	https://reviews.freebsd.org/D10098
2017-05-02 20:44:06 +00:00
emaste
e028666ef0 bsdgrep: fix -w flag matching with an empty pattern
-w flag matching with an empty pattern was generally 'broken', allowing
matches to occur on any line whether or not it actually matches -w
criteria.

This fix required a good amount of refactoring to address.  procline()
is altered to *only* process the line and return whether it was a match
or not, necessary to be able to short-circuit the whole function in case
of this matchall flag. -m flag handling is moved out as well because it
suffers from the same fate as context handling if we bypass any actual
pattern matching.

The matching context (matches, mostly) didn't previously exist outside
of procline(), so we go ahead and create context object for file
processing bits to pass around.  grep_printline() was created due to
this, for the scenarios where the matches don't actually matter and we
just want to print a line or two, a la flushing the context queue and
no -o or --color specified.

Damage from this broken behavior would have been mitigated by the fact
that it is unlikely users would invoke grep -w with an empty pattern.

This was identified while checking PR 105221 for problems it this may
cause in BSD grep, but PR 105221 is *not* a report of this behavior.

Submitted by:	Kyle Evans <kevans91 at ksu.edu>
Differential Revision:	https://reviews.freebsd.org/D10433
2017-05-02 20:39:33 +00:00